Compositions and methods for site directed genomic modification

ABSTRACT

The disclosure provides novel corn, tomato, and soybean U6, U3, U2, U5, and 7SL snRNA promoters which are useful for CRISPR/Cas-mediated targeted gene modifications in plants. The disclosure also provides methods for use of U6, U3, U2, U5, and 7SL promoters in driving expression of sgRNA polynucleotides which function in a CRISPR/Cas system of targeted gene modification in plants. The disclosure also provides methods of genome modification by insertion of blunt-end DNA fragments at a site of genomic cleavage.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a divisional of U.S. application Ser. No. 15/120,110, filed Aug. 18, 2016, which is a ‘371 National Stage application of International Application Serial No. PCT/US2015/018104, filed Feb. 27, 2015, which claims the benefit of priority to U.S. Provisional Application Ser. No. 61/945,700, filed Feb. 27, 2014, the entire disclosures of which are incorporated herein by reference.

INCORPORATION OF SEQUENCE LISTING

The sequence listing that is contained in the file named “MONS350US-updated_ST25.txt”, which is 248 kilobytes (measured in MS-WINDOWS) and created on Sep. 26, 2017, is filed herewith by electronic submission and incorporated herein by reference.

BACKGROUND

Field

The disclosure relates to the field of biotechnology. More specifically, the disclosure provides a method of introducing recombinant blunt-end double-strand DNA fragments into the genome of a plant by introducing a double-strand break in the genome and novel plant promoters beneficial for the expression of, for instance, non-protein-coding small RNAs for CRISPR-mediated genome modification.

Description of Related Art

Site-specific recombination has potential for application across a wide range of biotechnology-related fields. Meganucleases, zinc finger nucleases (ZFNs), and transcription activator-like effector nucleases (TALENs) containing a DNA-binding domain and a DNA-cleavage domain enable genome modification. While meganucleases, ZFNs, and TALENs are effective and specific, these technologies require generation through protein engineering of one or more components for each genomic site chosen for modification. Recent advances in application of clustered, regularly interspaced, short palindromic repeats (CRISPR) have illustrated a method of genome modification that may be as robust as the comparable systems (meganucleases, ZFNs, and TALENs), yet has the advantage of being quick to engineer.

The Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) system constitutes an adaptive immune system in prokaryotes that targets endonucleolytic cleavage of invading phage. The system is composed of a protein component (Cas) and a guide RNA (gRNA) that targets the protein to a specific locus for endonucleolytic cleavage. This system has been successfully engineered to target specific loci for endonucleolytic cleavage of mammalian, zebrafish, Drosophila, nematode, bacteria, yeast, and plant genomes.

SUMMARY

In one aspect the invention provides a recombinant DNA construct comprising a snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence encoding a single-guide RNA (sgRNA), wherein the sequence of said snRNA promoter comprises SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201, or SEQ ID NOs:247-283; or a fragment thereof, wherein the fragment is at least 140 bp in length.

In one embodiment the sequence of said U6 promoter may comprise any of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-166, SEQ ID NOs:200-201, or SEQ ID NO:283, or a fragment thereof, wherein the fragment is at least 140 bp in length. In a further embodiment, the sequence of said U6 promoter may comprise SEQ ID NO:7. In another embodiment the sequence of said U6 promoter may comprise a sequence selected from the group consisting of: SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In yet another embodiment the sequence of said U3 promoter may comprise any of SEQ ID NOs:167-171 or SEQ ID NOs:178-182, or a fragment thereof; wherein the fragment is at least 140 bp in length. In still yet another embodiment the sequence of said U2 promoter comprises any of SEQ ID NOs:183-187, SEQ ID NOs:192-199, or SEQ ID NOs:247-275, or a fragment thereof; wherein the fragment is at least 140 bp in length. In another embodiment the sequence of said U5 promoter comprises any of SEQ ID NOs:188-191, or SEQ ID NOs:276-282, or a fragment thereof; wherein the fragment is at least 140 bp in length. In a further embodiment the sequence of said 7SL promoter comprises any of SEQ ID NOs:172-177, or a fragment thereof; wherein the fragment is at least 140 bp in length. The recombinant DNA construct may further comprise a transcription termination sequence.

The recombinant DNA construct may also further comprise a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product. In certain embodiments of the recombinant DNA construct, the Cas endonuclease gene product may be further operably linked to a nuclear localization sequence (NLS). Further, in certain embodiments of the contemplated recombinant DNA construct, the sequence encoding said Cas endonuclease may be selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

Another aspect of the invention provides a recombinant DNA construct comprising a snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence specifying a non-coding RNA, wherein the sequence of said snRNA promoter comprises SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201 or SEQ ID NOs:247-283, or a fragment thereof, wherein the fragment is at least 140 bp in length. In some embodiments the non-coding RNA is selected from the group consisting of: a microRNA (miRNA), a miRNA precursor, a small interfering RNA (siRNA), a small RNA (22-26 nt in length) and precursor encoding same, a heterochromatic siRNA (hc-siRNA), a Piwi-interacting RNA (piRNA), a hairpin double strand RNA (hairpin dsRNA), a trans-acting siRNA (ta-siRNA), and a naturally occurring antisense siRNA (nat-siRNA).

Certain embodiments of the invention further comprise such a recombinant DNA construct, wherein the sequence of said U3 promoter comprises any of SEQ ID NOs:167-171 and SEQ ID NOs:178-182, or a fragment thereof; wherein the fragment is at least 140 bp in length. In another embodiment of the recombinant DNA construct, the sequence of said U2 promoter comprises any of SEQ ID NOs:183-187, SEQ ID NOs:192-199, or SEQ ID NOs:247-275, or a fragment thereof; wherein the fragment is at least 140 bp in length. In yet another embodiment of the recombinant DNA construct, the sequence of said U5 promoter comprises any of SEQ ID NOs:188-191, or SEQ ID NOs:276-282, or a fragment thereof; wherein the fragment is at least 140 bp in length. Still further, the invention provides an embodiment wherein the sequence of said U6 promoter may comprise any of SEQ ID NOs:1-20, SEQ ID NOs:146-149, SEQ ID NOs:160-166, SEQ ID NOs:200-201, or SEQ ID NO:283, or a fragment thereof; wherein the fragment is at least 140 bp in length. Another embodiment comprises the recombinant DNA construct wherein the sequence of said 7SL promoter comprises any of SEQ ID NOs:172-177, or a fragment thereof; wherein the fragment is at least 140 bp in length.

Another aspect of the invention provides a cell comprising a recombinant DNA construct as described above. In certain embodiments the cell is a plant cell.

The invention further provides a method of introducing a double-strand break in the genome of a cell, comprising introducing in said cell: a) at least one recombinant DNA construct of claim 1; and b) a second recombinant DNA construct comprising a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product operably linked to a nuclear localization sequence (NLS). In one embodiment of such a method, the sequence of the U6 promoter comprises SEQ ID NO:7. In another embodiment of the method, the U6 promoter comprises a sequence selected from the group consisting of: SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In yet another embodiment of the method, the sequence encoding said Cas endonuclease is selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

The invention further provides a method of introducing a double-strand break in the genome of a cell, comprising introducing to said cell at least one recombinant DNA construct which comprises a recombinant DNA construct comprising a snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence encoding a single-guide RNA (sgRNA), wherein the sequence of said snRNA promoter comprises SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201, or SEQ ID NOs:247-283; or a fragment thereof, wherein the fragment is at least 140 bp in length, and also further comprises a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product.

In certain embodiments of the method, the sequence of said U6 promoter comprises SEQ ID NO:7. In other embodiments the U6 promoter comprises a sequence selected from the group consisting of: SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In some embodiments of the method the sequence encoding the Cas endonuclease is selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

Another aspect of the invention provides a method of genome modification comprising: a) introducing a double-strand break at a selected site in the genome of a plant cell, and b) introducing into said plant cell a recombinant blunt-end double-strand DNA fragment, wherein said recombinant blunt-end double-strand DNA fragment is incorporated into said double-strand break by endogenous DNA repair. The method may comprise genome modification such as production of a modified linkage block, linking two or more QTLs, disrupting linkage of two or more QTLs, gene insertion, gene replacement, gene conversion, deleting or disrupting a gene, transgenic event selection, transgenic trait donor selection, transgene replacement, or targeted insertion of at least one nucleic acid of interest. In some embodiments of the method the double-strand break is introduced by an endonuclease. In certain embodiments the endonuclease may be selected from the group consisting of: a TALEN endonuclease; a CRISPR endonuclease; a meganuclease comprising a “LAGLIDADG,” (SEQ ID NO:284) “GIY-YIG,” “His-Cys box,” or HNH sequence motif; and a Zinc finger nuclease. In particular embodiments the endonuclease is a TALEN endonuclease and TALEN expression constructs are introduced into the plant cell, wherein about 0.1 pmol of each TALEN expression construct is introduced into the plant cell.

Further, in the method the plant cell may be a protoplast or may have been, or is being, grown in a plant cell culture. In certain embodiments of the method the plant cell is selected from the group consisting of: a soybean plant cell; a corn plant cell; a rice plant cell; a wheat plant cell; a turfgrass plant cell; a cotton plant cell; and a canola plant cell. In other embodiments of the method the recombinant blunt-end double-strand DNA fragment does not comprise a region of homology to the selected site in the genome.

Embodiments of the method are contemplated wherein about 0.03 to about 0.3 fmol of recombinant blunt-end double-strand DNA fragment is introduced into said plant cell. In particular embodiments about 0.15 fmol of recombinant blunt-end double-strand DNA fragment is introduced into said plant cell. Further, the blunt-end double-strand DNA fragment may comprise on the 5′ end, or the 3′ end, or both the 5′ and 3′ ends, a region with microhomology to a sequence comprising one or both ends of said double-strand break in the genome. Some embodiments comprise a method wherein the region of microhomology is selected from a sequence 1 bp, 2 bp, 3 bp, 4 bp, 5 bp, 6 bp, 7 bp, 8 bp, 9 bp, or 10 bp in length. In a particular embodiment of the method the region of microhomology is 3 bp in length.
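
By way of illustration only, the relationship between a microhomology region and an end of a double-strand break may be sketched computationally. The Python sketch below assumes a simplified model that is not taken from this disclosure: microhomology is counted as the longest run of bases (up to 10 bp) at the 5′ end of the fragment that is identical to the bases immediately upstream of the break on the same strand. The function name and the example sequences are hypothetical.

```python
def microhomology_length(left_flank: str, fragment: str, max_len: int = 10) -> int:
    """Longest k (0..max_len) for which the last k bases of the genomic
    sequence upstream of the break equal the first k bases of the fragment.

    Simplifying assumption: both sequences are given 5'->3' on the same strand.
    """
    left_flank = left_flank.upper()
    fragment = fragment.upper()
    best = 0
    limit = min(max_len, len(left_flank), len(fragment))
    for k in range(1, limit + 1):
        if left_flank[-k:] == fragment[:k]:
            best = k
    return best


# Hypothetical sequences, for illustration only.
upstream_of_break = "GGATCCACG"   # genomic bases ending at the break
donor_fragment = "ACGTTTGCA"      # one strand of a blunt-end fragment

shared = microhomology_length(upstream_of_break, donor_fragment)
print(shared)
```

Under this model the hypothetical fragment shares 3 bp with the break end, which corresponds to the 3 bp length named in the particular embodiment above; a fragment with no shared bases returns 0.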

The method may comprise introduction of a double-strand break in step a) as described above, by providing said cell with an endonuclease designed to target a selected target site in the genome of said cell. Further, the endonuclease may be provided by at least one recombinant DNA construct encoding the endonuclease. In an embodiment, the endonuclease is provided by delivering an mRNA encoding the endonuclease or the endonuclease to the plant cell. In particular embodiments the endonuclease is selected from the group consisting of: a TALEN endonuclease; a Zinc finger endonuclease; a meganuclease; and a CRISPR endonuclease. Additional embodiments may comprise introduction of a double-strand break in step a) by providing said cell with a recombinant DNA construct encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product and a recombinant DNA construct comprising a U6, U3, U2, U5, or 7SL promoter operably linked to a sequence encoding a single-guide RNA (sgRNA) designed to target a selected target site in the chromosome of said cell. In particular embodiments the Cas endonuclease gene product may be further operably linked to at least one nuclear localization sequence (NLS).

In certain embodiments of the method the sequence of said U6, U3, U2, U5, or 7SL promoter may comprise SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201 or SEQ ID NOs:247-283, or a fragment thereof; wherein the fragment is at least 140 bp in length and comprises a transcription termination sequence. In particular embodiments the U6 promoter may comprise a sequence selected from the group consisting of: SEQ ID NOs:1-20, SEQ ID NOs:146-149, SEQ ID NOs:160-166, SEQ ID NOs:200-201, and SEQ ID NO:283, or a fragment thereof; wherein the fragment is at least 140 bp in length comprising a transcription termination sequence. In alternative embodiments the U6 promoter may comprise a sequence selected from the group consisting of: SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20. In further embodiments the sequence of said U3 promoter may comprise any of SEQ ID NOs:167-171 or SEQ ID NOs:178-182, or a fragment thereof; wherein the fragment is at least 140 bp in length. In still further embodiments the sequence of said U5 promoter comprises any of SEQ ID NOs:188-191, or SEQ ID NOs:276-282, or a fragment thereof; wherein the fragment is at least 140 bp in length. Additionally, the sequence of said U2 promoter may comprise any of SEQ ID NOs:183-187, SEQ ID NOs:192-199, or SEQ ID NOs:247-275, or a fragment thereof; wherein the fragment is at least 140 bp in length. In yet other embodiments the sequence of said 7SL promoter comprises any of SEQ ID NOs:172-177, or a fragment thereof, wherein the fragment is at least 140 bp in length.

Embodiments are also contemplated wherein the recombinant DNA construct encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product, and the recombinant DNA construct comprising a U6, U3, U2, U5, or 7SL promoter operably linked to a sequence encoding a single-guide RNA (sgRNA) designed to target a selected target site in the chromosome of said cell, are on the same construct. In other embodiments of the method, the recombinant DNA construct encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product, and the recombinant DNA construct comprising a U6, U3, U2, U5, or 7SL promoter operably linked to a sequence encoding a single-guide RNA (sgRNA) designed to target a selected target site in the chromosome of said cell, are on at least two constructs.

A further aspect of the invention comprises a plant cell comprising a targeted recombinant site-directed integration of a blunt-end double-strand DNA fragment. Further provided are a plant, plant part, or plant seed comprising a targeted recombinant site-directed integration of a blunt-end double-strand DNA fragment.

A still further aspect of the invention comprises a method of genome modification comprising: a) introducing a double-strand break in the genome of a plant cell by introducing in said cell: i) at least one recombinant DNA construct of claim 1; and ii) a second recombinant DNA construct comprising a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product operably linked to a nuclear localization sequence (NLS); and b) introducing into said plant cell a recombinant blunt-end double-strand DNA fragment, wherein said recombinant blunt-end double-strand DNA fragment is incorporated into said double-strand break by endogenous DNA repair.

A further aspect of the invention comprises a method of genome modification comprising: a) introducing a double-strand break in the genome of a plant cell as described above, and b) introducing into said plant cell a recombinant blunt-end double-strand DNA fragment, wherein said recombinant blunt-end double-strand DNA fragment is incorporated into said double-strand break by endogenous DNA repair.

Yet another aspect of the invention comprises a recombinant DNA construct comprising at least a first expression cassette comprising a U6, U3, U2, U5, or 7SL promoter operably linked to a sequence encoding a single-guide RNA (sgRNA), wherein the sequence of said promoter comprises any of SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201, or SEQ ID NOs:247-283, or a fragment thereof; wherein the fragment is at least 140 bp in length. In certain embodiments the recombinant DNA construct further comprises at least a second expression cassette, wherein the sequence encoding the first sgRNA is distinct from the sequence encoding the second sgRNA. The recombinant DNA construct may also comprise a construct wherein the promoter operably linked to the sequence encoding the first sgRNA is distinct from the promoter operably linked to the sequence encoding the second sgRNA. In certain embodiments the construct comprises flanking left and right homology arms (HA) which are each about 200-1200 bp in length. In particular embodiments the homology arms are about 230 to about 1003 bp in length.

Another aspect of the invention provides a method of quantifying the activity of a nuclease by detecting integrated DNA fragments by determining the rate of homologous recombination (HR) mediated targeted integration by use of digital PCR or quantitative PCR.

Yet another aspect of the invention comprises a recombinant DNA construct comprising: a) a first snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence encoding a non-coding RNA, and b) a second snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence encoding a non-coding RNA, wherein the first snRNA promoter and the second snRNA promoter are different. In certain embodiments the sequence encoding the first snRNA promoter and the sequence encoding the second snRNA promoter each comprise SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201, or SEQ ID NOs:247-283, or a fragment thereof; wherein the fragment is at least 140 bp in length. Further, a recombinant DNA construct, wherein the first and second snRNA promoters are U6 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of: SEQ ID NOs:1-8, SEQ ID NOs:17-20, and SEQ ID NOs:200-201, is also provided in certain embodiments.

Thus, a recombinant DNA construct wherein the first and second snRNA promoters are U6 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of: SEQ ID NOs:12-16, SEQ ID NOs:160-166, and SEQ ID NO:283, is also provided. Alternatively, a recombinant DNA construct, wherein the first and second snRNA promoters are U6 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of: SEQ ID NOs:9-11 and SEQ ID NOs:146-149, is provided.

A recombinant DNA construct, wherein the first and second snRNA promoters are U2 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:183-187 and SEQ ID NOs:192-199, is also contemplated. Additionally, certain embodiments of the invention comprise a recombinant DNA construct wherein the first and second snRNA promoters are U2 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:247-275.

Yet other embodiments comprise a recombinant DNA construct, wherein the first and second snRNA promoters are U3 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:178-182. Still other embodiments of the invention comprise a recombinant DNA construct, wherein the first and second snRNA promoters are U3 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:167-171.

Alternatively, the recombinant DNA construct may comprise first and second snRNA promoters which are U5 promoters and wherein the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:188-191. Alternatively provided are recombinant DNA constructs wherein the first and second snRNA promoters are U5 promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:276-282.

Certain embodiments of the invention provide a recombinant DNA construct wherein the first and second snRNA promoters are 7SL promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:175-177. In other embodiments of the recombinant DNA construct, the first and second snRNA promoters are 7SL promoters and the sequences encoding the first and second snRNA promoters are each selected from the group consisting of SEQ ID NOs:172-174.

Also contemplated are embodiments wherein the recombinant DNA construct comprises a first snRNA promoter which is a U6 promoter and a second snRNA promoter which is selected from the group consisting of: a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter. Other embodiments include a recombinant DNA construct wherein the first snRNA promoter is a U3 promoter and the second snRNA promoter is selected from the group consisting of: a U6 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter. Alternatively in the recombinant DNA construct, the first snRNA promoter is a U2 promoter and the second snRNA promoter may be selected from the group consisting of: a U6 promoter, a U3 promoter, a U5 promoter, and a 7SL promoter; or the first snRNA promoter is a U5 promoter and the second snRNA promoter is selected from the group consisting of: a U6 promoter, a U2 promoter, a U3 promoter, and a 7SL promoter. Further, the recombinant DNA construct may comprise a first snRNA promoter which is a 7SL promoter and the second snRNA promoter may be selected from the group consisting of: a U6 promoter, a U2 promoter, a U3 promoter, and a U5 promoter.

Other contemplated embodiments of the invention include a recombinant DNA construct as described above, wherein the sequences encoding the first and second snRNA promoters are each selected from the group consisting of: SEQ ID NOs:1-8, SEQ ID NOs:17-20, SEQ ID NOs:200-201, SEQ ID NOs:183-187, SEQ ID NOs:192-199, SEQ ID NOs:178-182, SEQ ID NOs:188-191, and SEQ ID NOs:175-177. In certain embodiments of the recombinant DNA construct, the sequences encoding the first and second snRNA promoters are each selected from the group consisting of: SEQ ID NOs:12-16, SEQ ID NOs:160-166, SEQ ID NO:283, SEQ ID NOs:247-275, SEQ ID NOs:167-171, SEQ ID NOs:276-282, and SEQ ID NOs:172-174.

The recombinant DNA construct may further comprise a sequence specifying one or more additional snRNA promoters selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to a sequence encoding a non-coding RNA, wherein the first snRNA promoter, the second snRNA promoter, and each of the one or more additional snRNA promoters are different. In particular embodiments of the recombinant DNA construct, the sequence specifying said one or more additional snRNA promoters is selected from the group consisting of: SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-201, or SEQ ID NOs:247-283; or a fragment thereof, wherein the fragment is at least 140 bp in length. Further, the recombinant DNA construct may comprise 3, 4, 5, 6, 7, 8, 9 or 10 snRNA promoters.

In some embodiments of the recombinant DNA construct, the non-coding RNAs are sgRNAs targeting different selected target sites in a chromosome of a plant cell. The recombinant DNA constructs may further comprise a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product.

Yet another aspect of the invention provides a method of genome modification comprising: a) introducing double-strand breaks at two or more selected sites in the genome of a plant cell by providing said cell with a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease and a recombinant DNA construct wherein the non-coding RNAs are sgRNAs targeting different selected target sites in a chromosome of a plant cell, and b) introducing into said plant cell one or more exogenous double-strand DNA fragments; wherein said exogenous double-strand DNA fragments are incorporated into said double-strand breaks by endogenous DNA repair. In some embodiments said one or more exogenous double-strand DNA fragments are blunt-ended. In certain embodiments of the method, said one or more exogenous double-strand DNA fragments comprise a region of homology to a selected site in the genome. In other embodiments the exogenous double-strand DNA fragments comprise regions of homology to different selected sites in the genome.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present disclosure. The disclosure may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

FIGS. 1A-1B: Nucleotide sequence alignment of four native corn U6 small nuclear RNA (snRNA) genes, including their putative promoters from chromosomes 1, 2, 3, and 8. (FIG. 1A) and (FIG. 1B) The sequence consensus, and (SEQ ID NOs:285-292) percent conservation are presented below the alignments. (FIG. 1B) The thick arrow indicates the transcription start site; upstream from the transcriptional start site are a ‘TATA Box’, an Upstream Sequence Element (USE), and Monocot-Specific Promoter (MSP) elements, each marked with heavy-lined boxes; the stretch of seven thymidine bases (poly-T) at the 3′ end is the transcription termination signal. The sequences in FIG. 1A and FIG. 1B correspond to the following: ZmU6_Ch1 represented by SEQ ID NO:98; ZmU6_Ch2 represented by SEQ ID NO:99; ZmU6_Ch3 represented by SEQ ID NO:100; ZmU6_Ch8 represented by SEQ ID NO:101.
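
For illustration, the poly-T termination signal described for FIG. 1B could be located in a sequence with a short search. The Python sketch below is a minimal example under stated assumptions: the sequence is written 5′ to 3′ as DNA, the first run of at least seven consecutive thymidines is taken as the signal, and the example sequence and function name are hypothetical rather than taken from the sequence listing.

```python
import re


def find_poly_t(seq: str, min_run: int = 7):
    """Return (start, end) of the first run of at least `min_run` T bases,
    using 0-based, end-exclusive coordinates, or None if no such run exists."""
    match = re.search(r"T{%d,}" % min_run, seq.upper())
    return (match.start(), match.end()) if match else None


# Hypothetical sequence: 8 bases, then seven T's, then 2 bases.
example = "ACGTACGGTTTTTTTAC"
print(find_poly_t(example))
```

The isolated T near the start of the hypothetical sequence is ignored because it does not reach the seven-base minimum.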

FIG. 2: Illustration of a modified GUS (β-glucuronidase) reporter gene harboring a direct repeat of the coding sequence (GUUS) interrupted by a target site (TS) for CRISPR cleavage.

FIG. 3: GUS activities detected in corn callus after co-bombardment of a GUUS reporter construct together with CRISPR constructs designed for introducing a double-stranded break (DSB) at the Zm7 genomic target site.

FIG. 4: GUS activity detected in corn callus after co-bombardment of a GUUS reporter construct together with CRISPR constructs designed for introducing a DSB at the Zm231 genomic target site. A different genomic target and single-guide RNA (sgRNA) spacer sequence, Zm14, were used as negative control. Also shown are fluorescence microscopy images of representative calli which were co-bombarded with a green fluorescent protein (GFP) expression vector with the GUUS reporter construct, Cas9 expression vector and vectors containing the various sgRNA cassettes.

FIGS. 5A-5E: Illustrations of (FIG. 5A) oligonucleotide integration assay; (FIG. 5B) blunt-end oligonucleotide without microhomology used for insertion at a corn genomic target site (SEQ ID NO:293 pre-insertion and SEQ ID NOs:294 and 295 after insertion); (FIG. 5C) blunt-end oligonucleotide with microhomology ends used for insertion at a corn genomic target site (SEQ ID NO:293 pre-insertion and SEQ ID NOs:294 and 295 after insertion); (FIG. 5D) fragment analysis profile of PCR amplicons spanning the oligo-chromosome junction in test (upper panel) and negative control samples (bottom panel) of the oligonucleotide integration assay (where the arrow indicates the expected peak); and (FIG. 5E) DNA sequences of oligonucleotide-chromosome junctions (SEQ ID NOs:294 and 295) at the Zm_L70c corn genomic target site confirming integrations of both full-length (integration 1; SEQ ID NO:103) and truncated oligonucleotides (integration 2; SEQ ID NO:104); the expected sequence (template) is presented as SEQ ID NO:102.

FIG. 6: Illustration of a sgRNA including a spacer sequence complementary to a native corn genomic target site and an artificial loop (5′-CCAAAAGG-3′; SEQ ID NO:105) and its predicted secondary structure designed for Streptococcus thermophilus Cas9-mediated targeting (SEQ ID NO:110).

FIGS. 7A-7B: Illustrations of (FIG. 7A) selectable marker gene removal by multiplex CRISPR activity following targeted integration of the gene of interest (GOI); and (FIG. 7B) a CRISPR/Cas multiplex system to evaluate gene linkage of multiple QTL candidate genes. Logarithm of odds (LOD) is a statistical measure for genetic linkage; an LOD of 3 means that it is 1000× more likely that a QTL exists in the interval than that there is no QTL.

FIG. 8. Graphical presentation of data showing percentage targeted integration rates (Y-axis) detected at 24 and 48 hours post-transformation of corn protoplasts using CRISPR constructs targeting a native chromosomal target (Zm7) in corn and a titration of the pmol of blunt-end, double-stranded DNA fragment added to the transfection mixture (X-axis). The negative controls were run without added Cas9 expression constructs.

FIG. 9. Graphical presentation of the integration rate (Y-axis) as a function of the amount (in pmol) of SpCas9 expression construct added to the transfection mixture of corn protoplasts (X-axis).

FIGS. 10A-10C. Sequence confirmation for targeted integrations of blunt-end, double-strand DNA fragments into chromosomes of corn protoplasts transformed with CRISPR/Cas9 and sgRNA expression constructs. For all panels FIGS. 10A, 10B, 10C, the top sequence is the expected sequence of one junction of the target site and the blunt-end double-strand DNA fragment (underlined sequence) included in the experiment. FIG. 10A. Corn chromosome site Zm7 targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment formed by annealed DNA fragments represented by SEQ ID NO:115 and SEQ ID NO:116. FIG. 10B. Corn chromosome site L70c targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment without micro-homology sequences formed by annealed DNA fragments represented by SEQ ID NO:45 and SEQ ID NO:46. FIG. 10C. Corn chromosome site L70c targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment with 3 bp micro-homology sequences at each end of the DNA fragment formed by annealed DNA fragments represented by SEQ ID NO:121 and SEQ ID NO:122.

FIG. 11. Graphical presentation of the integration rate (Y-axis) as a function of the amount (in pmol) of TALEN expression constructs targeting corn chromosome site L70.4 which were added to the transfection mixture of corn protoplasts (X-axis).

FIGS. 12A-12B. Schematic representation of NHEJ and HR-mediated targeted integration and PCR primer positions for high-throughput screening. Targeted integration of a DNA fragment by non-homologous end-joining (NHEJ) is presented in FIG. 12A and targeted integration of a DNA fragment by homologous recombination (HR) is presented in FIG. 12B.

FIGS. 13A-13B. Schematic representation of the constructs used for homologous integration. The blunt-end DNA arrow indicates the 90 bp sequence corresponding to the 90 bp blunt-end, double-strand DNA fragment used for NHEJ assays, LHA refers to left-homology arm, RHA refers to right-homology arm, Zm7 refers to the target site Zm7 targeted by a CRISPR/Cas9+sgRNA. The length in bp of each of the homology arms is indicated. FIG. 13A. Schematic for HR-cassette construct for targeting the corn chromosome site Zm7 with LHA and RHA of 240 and 230 bp in length, respectively. FIG. 13B. Schematic for HR-cassette construct for targeting the corn chromosome site Zm7 with LHA and RHA of 240 and 1003 bp in length, respectively.

FIGS. 14A-14B. Schematic representation of the constructs used for homologous integration. In the figure, blunt-end DNA arrow indicates the 90 bp sequence corresponding to the 90 bp blunt-end, double-strand DNA fragment used for NHEJ assays, LHA refers to left-homology arm, RHA refers to right-homology arm, L70.4 refers to the target site L70.4 in the corn chromosome targeted by a TALEN pair. The length in bp of each of the homology arms is indicated. FIG. 14A. Schematic for HR-cassette construct for targeting the corn chromosome site L70.4 with both the LHA and RHA 230 bp in length. FIG. 14B. Schematic for HR-cassette construct for targeting the corn chromosome site L70.4 with LHA and RHA of 1027 bp and 230 bp in length, respectively.

FIGS. 15A-15B. FIG. 15A. Graphical presentation of data showing percent targeted integration rates in transfected corn protoplasts using StCas9 CRISPR constructs targeting native corn chromosomal target sites L70e, L70f, and L70g. The controls lacked a StCas9 expression cassette construct in the transfection mixture. FIG. 15B. Sequence alignment of expected integration of the blunt-end, double-strand DNA fragment at the L70f target site (SEQ ID NO:144) and one example of target site integration with indel of the DNA fragment sequence (SEQ ID NO:145).

FIGS. 16A-16B. FIG. 16A. Chromosomal integration rates using constructs with the corn chromosome 8 U6 promoter or one of three separate chimeric U6 promoters driving sgRNA expression in a CRISPR/Cas9 system to target three different corn chromosomal target sites. Targeted integration was measured by ddPCR assay using MGB TaqMan probes. FIG. 16B. Chromosomal integration rates using constructs with the corn chromosome 8 U6 promoter or one of three separate chimeric U6 promoters driving sgRNA expression in a CRISPR/Cas9 system to target three different corn chromosomal target sites. Targeted integration was measured by ddPCR assay using EvaGreen® intercalating dye.

FIGS. 17A-17C. FIG. 17A. Schematic of PCR screening strategy to detect CRISPR/Cas9 induced mutation by NHEJ at tomato invertase inhibitor target site 2 (TS2), resulting in mutation of restriction endonuclease site Sm1I. FIG. 17B. Photograph of PCR amplicons run on an agarose gel showing undigested amplicons and Sm1I digested amplicons to detect CRISPR/Cas9 induced mutation at tomato invertase inhibitor target site 2. FIG. 17C. Multiple sequence alignment of sequences of PCR amplicons from CRISPR/Cas9 induced mutation by NHEJ at the tomato invertase inhibitor target site 2.

FIGS. 18A-18B. FIG. 18A. Graphical representation of data showing normalized GUS mRNA levels from soybean cotyledon protoplast assays with recombinant expression constructs with U6, U3, and 7SL promoters. FIG. 18B. Graphical representation of data showing normalized GUS mRNA levels from corn leaf protoplast assays with recombinant expression constructs with U6, U3, 7SL, U2, or U5 promoters.

FIG. 19. Graphical representation of data from normalized GUS expression levels from corn leaf protoplast assays with recombinant expression constructs encoding 1) a GUS expression construct, 2) a dead Cas9-TALE-AD expression construct, and 3) recombinant sgRNA expression constructs with 7SL, U6, U3, U2, or U5 promoters.

DETAILED DESCRIPTION

The disclosure provides novel promoters from Zea mays and other plants, and methods for their use that include targeted gene modification of a plant genome using transgenic expression of a gene, or genes, involved in the Clustered Regularly Interspersed Short Palindromic Repeats (CRISPR) system found in many bacteria. For instance, the disclosure provides, in one embodiment, DNA constructs encoding at least one expression cassette including a U6 promoter disclosed herein and a sequence encoding a single-guide RNA (sgRNA). Methods for causing a CRISPR system to modify a target genome are also provided, as are the genomic complements of a plant modified by the use of such a system. The disclosure thus provides tools and methods that allow one to insert, remove, or modify genes, loci, linkage blocks, and chromosomes within a plant. Also disclosed are U3, U2, U5 and 7SL promoters and methods for their use that include targeted gene modification of a plant genome.

The disclosure provides, in another embodiment, DNA constructs encoding at least one expression cassette including a promoter disclosed herein and a sequence encoding a non-protein-coding small RNA (npcRNA). These constructs are useful for targeting nuclear expression of the npcRNA molecules.

The CRISPR system constitutes an adaptive immune system in prokaryotes that targets endonucleolytic cleavage of the DNA and RNA of invading phage (reviewed in Westra et al., Annu Rev Genet, 46:311-39, 2012). There are three known types of CRISPR systems, Type I, Type II, and Type III. The CRISPR systems rely on small RNAs for sequence-specific detection and targeting of foreign nucleic acids for destruction. The components of the bacterial CRISPR systems are CRISPR-associated (Cas) genes and CRISPR array(s) consisting of genome-target sequences (protospacers) interspersed with short palindromic repeats. Transcription of the protospacer/repeat elements into precursor CRISPR RNA (pre-crRNA) molecules is followed by enzymatic cleavage triggered by hybridization between a trans-acting CRISPR RNA (tracrRNA) molecule and a pre-crRNA palindromic repeat. The resulting crRNA:tracrRNA molecules, consisting of one copy of the spacer and one repeat, complex with a Cas nuclease. The CRISPR/Cas complex is then directed to DNA sequences (protospacer) complementary to the crRNA spacer sequence, where this RNA-Cas protein complex silences the target DNA through enzymatic cleavage of both strands (double-strand break; DSB).
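By way of non-limiting illustration, the sequence-directed targeting described above can be sketched in Python as a search for sites in a DNA duplex that can base-pair with a given spacer. The function names (`reverse_complement`, `find_protospacers`) and the toy sequences are hypothetical and are not taken from the sequence listing; the sketch assumes exact matching only and ignores any additional sequence requirements that a particular Cas nuclease may impose.

```python
# Toy illustration only: all sequences are invented, not from the disclosure.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def find_protospacers(genome: str, spacer_rna: str) -> list[tuple[int, str]]:
    """Find exact sites where one strand of the duplex can pair with the spacer.

    `genome` is the top strand written 5'->3'. A "+" hit means the top strand
    reads the same as the spacer (so the bottom strand is complementary to it);
    a "-" hit means the top strand itself is complementary to the spacer.
    Returns (0-based start on the top strand, strand) pairs.
    """
    spacer_dna = spacer_rna.upper().replace("U", "T")
    hits = []
    for strand, query in (("+", spacer_dna),
                          ("-", reverse_complement(spacer_dna))):
        start = genome.find(query)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(query, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    toy_genome = "TTGACC" + "GATAGCTAGGCTAACC" + "GGTATT"
    toy_spacer = "GAUAGCUAGGCUAACC"
    print(find_protospacers(toy_genome, toy_spacer))  # [(6, '+')]
```

Searching both strands reflects that a double-stranded target can present the complementary sequence on either strand; a practical design tool would also need to tolerate mismatches and screen for off-target sites, which this sketch does not attempt.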

The native bacterial type II CRISPR system requires four molecular components for targeted cleavage of exogenous DNAs: a Cas endonuclease (e.g., Cas9), the house-keeping RNaseIII, CRISPR RNA (crRNA) and trans-acting CRISPR RNA (tracrRNA). The latter two components form a dsRNA complex and bind to Cas9 resulting in an RNA-guided DNA endonuclease complex. For targeted genome modifications in eukaryotes, this system was simplified to two components: the Cas9 endonuclease and a chimeric crRNA-tracrRNA, called guide-RNA (gRNA) or, alternatively, single-guide RNA (sgRNA). Experiments initially conducted in eukaryotic systems determined that the RNaseIII component was not necessary to achieve targeted DNA cleavage. The minimal two component system of Cas9 with the sgRNA, as the only unique component, enables this CRISPR system of targeted genome modification to be more cost effective and flexible than other targeting platforms such as meganucleases, Zn-finger nucleases, or TALE-nucleases which require protein engineering for modification at each targeted DNA site. Additionally, the ease of design and production of sgRNAs provides the CRISPR system with several advantages for application of targeted genome modification. For example, the CRISPR/Cas complex components (Cas endonuclease, sgRNA, and, optionally, exogenous DNA for integration into the genome) designed for one or more genomic target sites can be multiplexed in one transformation, or the introduction of the CRISPR/Cas complex components can be spatially and/or temporally separated.

Expression Strategies for sgRNAs

The disclosure provides, in certain embodiments, novel combinations of promoters and a sequence encoding a sgRNA, to allow for specifically introducing a double-stranded DNA cleavage event into endogenous DNA (i.e., a genome). In one embodiment, a U6 promoter from corn is operably linked to a sgRNA-encoding gene, in order to constitutively express the sgRNA in transformed cells. This may be desirable, for example, when the resulting sgRNA transcripts are retained in the nucleus and will thus be optimally located within the cell to guide nuclear processes. This may also be desirable, for example, when the activity of the CRISPR is low or the frequency of finding and cleaving the target site is low. It may also be desirable when a promoter for a specific cell type, such as the germ line, is not known for a given species of interest. In another embodiment, a U3, U2, U5, or 7SL promoter is operably linked to a sgRNA-encoding gene, for expression of an sgRNA in transformed cells.

In another embodiment, a chimeric promoter comprising all or a portion of any of the U6 promoters provided herein can be used to express a sgRNA. Alternatively, a U3, U2, U5, or 7SL chimeric promoter comprising all or a portion of any of these promoters, may be utilized. For example, the 5′ portion of the U6 promoter from corn chromosome 1 (SEQ ID NO:1), including one MSP element, operably linked to the 3′ portion of the U6 promoter from corn chromosome 8 (SEQ ID NO:7), including a USE element and a TATA box (SEQ ID NO:17), cloned upstream of a sgRNA, may be used to induce CRISPR-mediated cleavage under different environmentalconditions.
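The chimeric arrangement described above, in which a 5′ portion of one promoter is joined to a 3′ portion of another, can be sketched as a simple sequence operation. The following Python sketch is illustrative only: the function name `make_chimeric_promoter`, the breakpoint arguments, and the short toy sequences are hypothetical, and the actual breakpoints used to derive the disclosed chimeric promoters are not assumed here.

```python
def make_chimeric_promoter(five_prime_donor: str,
                           three_prime_donor: str,
                           five_prime_end: int,
                           three_prime_start: int) -> str:
    """Join the 5' portion of one promoter to the 3' portion of another.

    Takes five_prime_donor[:five_prime_end] followed by
    three_prime_donor[three_prime_start:]. Both breakpoints are
    hypothetical 0-based indices chosen by the designer.
    """
    if not 0 < five_prime_end <= len(five_prime_donor):
        raise ValueError("five_prime_end is outside the 5' donor sequence")
    if not 0 <= three_prime_start < len(three_prime_donor):
        raise ValueError("three_prime_start is outside the 3' donor sequence")
    return (five_prime_donor[:five_prime_end]
            + three_prime_donor[three_prime_start:])


if __name__ == "__main__":
    # Invented placeholder sequences standing in for two donor promoters.
    donor_a = "GGGCCCAAATTT"
    donor_b = "ACGTACGTATATAAGG"
    print(make_chimeric_promoter(donor_a, donor_b, 6, 8))  # GGGCCCATATAAGG
```

In a design along the lines of the example above, the breakpoints would presumably be chosen so that the 5′ portion retains the desired MSP element and the 3′ portion retains the USE element and TATA box.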

Multiple U6 promoters with differing sequence may be utilized to minimize problems in vector stability, which is typically associated with sequence repeats. Further, highly repetitive regions in chromosomes may lead to genetic instability and silencing. Therefore, use of multiple U6 (or other disclosed) promoters in the CRISPR/Cas system of targeted gene modification may facilitate vector stacking of multiple sgRNA cassettes in the same transformation construct, wherein the differing sgRNA transcript levels are to be optimized for efficient targeting of a single target site. Chimeric U6 promoters can result in new, functional versions with improved or otherwise modified expression levels, and four representative chimeric corn U6 promoters have been designed (SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, and SEQ ID NO:20).
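One way to reason about the repeat concern described above is to measure the longest stretch of identical sequence shared by each pair of promoters intended for the same construct. The Python sketch below is offered under stated assumptions: the function names, the `max_run` threshold, and the toy sequences are hypothetical, only same-strand (direct) repeats are considered, and no particular threshold is taught by the disclosure.

```python
from itertools import combinations


def longest_shared_run(a: str, b: str) -> int:
    """Length of the longest contiguous sequence present in both a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for base_a in a:
        curr = [0] * (len(b) + 1)
        for j, base_b in enumerate(b, start=1):
            if base_a == base_b:
                # Extend the run that ended at the previous base of each.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def flag_repeats(promoters: dict[str, str],
                 max_run: int) -> list[tuple[str, str, int]]:
    """Report promoter pairs sharing an identical run longer than max_run."""
    flagged = []
    for (name_a, seq_a), (name_b, seq_b) in combinations(promoters.items(), 2):
        run = longest_shared_run(seq_a, seq_b)
        if run > max_run:
            flagged.append((name_a, name_b, run))
    return flagged


if __name__ == "__main__":
    # Invented placeholder sequences, far shorter than real promoters.
    toy = {"p1": "ACGTTGCA", "p2": "TTACGTAA", "p3": "GGGGGGGG"}
    print(flag_repeats(toy, max_run=3))  # [('p1', 'p2', 4)]
```

A pair flagged by such a check could be replaced with a more divergent promoter, for instance one of the chimeric promoters, before the sgRNA cassettes are stacked in a single construct.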

The disclosed U6 promoters may also drive expression of other non-protein-coding RNA (npcRNA). Non-limiting examples of non-protein-coding small RNA include a microRNA (miRNA), a miRNA precursor, a small interfering RNA (siRNA), a small RNA (22-26 nt in length) and precursor encoding same, a heterochromatic siRNA (hc-siRNA), a Piwi-interacting RNA (piRNA), a hairpin double strand RNA (hairpin dsRNA), a trans-acting siRNA (ta-siRNA), and a naturally occurring antisense siRNA (nat-siRNA).

Promoters and transcriptional elements for additional small nuclear RNA (snRNA) genes, similar to U6 promoters and which may be transcribed by RNA polymerase II or RNA polymerase III, can also be identified, such as U3, U2, U5, and 7SL promoters. These alternate promoters can be useful in cassette design, especially where these additional elements may facilitate nuclear retention of the CRISPR system transcripts. Additional gene transcription elements that can be useful in CRISPR cassette design include intron-embedded elements and transcriptional elements of plant specific RNA polymerase IV and V promoters.

Expression Strategies for Cas-Associated Genes

The disclosure provides novel promoters for use in sequence-specific or sequence-directed CRISPR-mediated cleavage for molecular breeding by providing transcription of, for example, a sgRNA including a spacer sequence used to target a protospacer sequence within a genomic target site for endonuclease cleavage by at least one Cas protein, wherein the genomic target site is native or transgenic. In addition, CRISPR systems can be customized to catalyze cleavage at one or more genomic target sites. In certain embodiments, such a custom CRISPR system would have properties making it amenable to genetic modification such that the system's Cas endonuclease protein(s) recognition, binding and/or catalytic activity could be manipulated.

One aspect of this disclosure is to introduce into a plant cell an expression vector comprising one or more cassettes encoding a U6 corn promoter, or other disclosed promoter such as a U3, U2, U5 or 7SL promoter, operably linked to a sgRNA, including a copy of a spacer sequence complementary to a protospacer sequence within a genomic target site, and an expression vector encoding a Cas-associated gene to modify the plant cell in such a way that the plant cell, or a plant comprised of such cells, will subsequently exhibit a beneficial trait. In one non-limiting example, the trait is a trait such as improved yield, resistance to biotic or abiotic stress, herbicide tolerance, or other improvements in agronomic performance. The ability to generate such a plant cell, or a plant derived therefrom, depends on introducing the CRISPR system using transformation vectors and cassettes described herein.

The expression vector encoding a Cas-associated gene may comprise a promoter. In certain embodiments, the promoter is a constitutive promoter, a tissue specific promoter, a developmentally regulated promoter, or a cell cycle regulated promoter. Certain contemplated promoters include ones that only express in the germline or reproductive cells, among others. Such developmentally regulated promoters have the advantage of limiting the expression of the CRISPR system to only those cells in which DNA is inherited in subsequent generations. Therefore, a CRISPR-mediated genetic modification (i.e., chromosomal or episomal dsDNA cleavage) is limited only to cells that are involved in transmitting their genome from one generation to the next. This might be useful if broader expression of the CRISPR system were genotoxic or had other unwanted effects. Examples of such promoters include the promoters of genes encoding DNA ligases, recombinases, replicases, and so on.

Endonucleases are enzymes that cleave the phosphodiester bond within a polynucleotide chain. Examples of endonucleases that cleave only at specific nucleotide sequences are well known in the art and can include, for instance, restriction endonucleases. However, the need for targeted genome engineering as an alternative to classical plant breeding requires highly customizable tools for genome editing. The CRISPR-associated type II prokaryotic adaptive immune system provides such an alternative. As such, the DNA constructs provided herein can recognize a specific nucleotide sequence of interest within a target host genome and allow for mutation or integration at that site. In a particular embodiment, the DNA constructs contain one or more corn U6 promoters, or chimeras thereof, that express high levels of a sequence encoding a sgRNA. A DNA construct may be particularly useful when it expresses a sgRNA that targets a Cas-associated gene product with endonuclease activity to a specific genomic sequence, such that the specific genomic sequence is cleaved, producing a double-stranded break that is repaired by a double strand break repair pathway (which may include, for example, non-homologous end-joining, homologous recombination, synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), or a combination thereof), thereby disrupting the native locus.

In one embodiment, a CRISPR system comprises at least one Cas-associated gene encoding a CRISPR endonuclease and one sgRNA comprising a copy of a spacer sequence complementary to a protospacer sequence within an endogenous genomic target site.

In particular embodiments, a Cas-associated gene can include any type II CRISPR system endonuclease. Such a Cas-associated gene product would have properties making it amenable to genetic modification such that its nuclease activity and its recognition and binding of crRNA, tracrRNA, and/or sgRNA could be manipulated.

The present disclosure also provides for use of CRISPR-mediated double-stranded DNA cleavage to genetically alter expression and/or activity of a gene or gene product of interest in a tissue- or cell-type specific manner to improve productivity or provide another beneficial trait, wherein the nucleic acid of interest may be endogenous or transgenic in nature. Thus, in one embodiment, a CRISPR system is engineered to mediate disruption at specific sites in a gene of interest. Genes of interest include those for which altered expression level/protein activity is desired. These DNA cleavage events can be either in coding sequences or in regulatory elements within the gene.

This disclosure provides for the introduction of a type II CRISPR system into a cell. Exemplary type II Cas-associated genes include natural and engineered (i.e., modified, including codon-optimized) nucleotide sequences encoding polypeptides with nuclease activity such as Cas9 from Streptococcus pyogenes, Streptococcus thermophilus, or Bradyrhizobium sp.

The catalytically active CRISPR-associated gene product (e.g., Cas9 endonuclease) can be introduced into, or produced by, a target cell. Various methods may be used to carry this out, as disclosed herein.

Transient Expression of CRISPRs

In some embodiments, the sgRNA and/or Cas-associated gene is transiently introduced into a cell. In certain embodiments, the introduced sgRNA and/or Cas-associated gene is provided in sufficient quantity to modify the cell but does not persist after a contemplated period of time has passed or after one or more cell divisions. In such embodiments, no further steps are needed to remove or segregate the sgRNA and/or Cas-associated gene from the modified cell. In yet other embodiments of this disclosure, double-stranded DNA fragments are also transiently introduced into a cell along with sgRNA and/or Cas-associated gene. In such embodiments, the introduced double-stranded DNA fragments are provided in sufficient quantity to modify the cell but do not persist after a contemplated period of time has passed or after one or more cell divisions.

In another embodiment, mRNA encoding the Cas-associated gene is introduced into a cell. In such embodiments, the mRNA is translated to produce the type II CRISPR system endonuclease in sufficient quantity to modify the cell (in the presence of at least one sgRNA) but does not persist after a contemplated period of time has passed or after one or more cell divisions. In such embodiments, no further steps are needed to remove or segregate the Cas-associated gene from the modified cell.

In one embodiment of this disclosure, a catalytically active Cas-associated gene product is prepared in vitro prior to introduction to a cell, including a prokaryotic or eukaryotic cell. The method of preparing a Cas-associated gene product depends on its type and properties and would be known by one of skill in the art. For example, if the Cas-associated gene product is a large monomeric DNA nuclease, the active form of the Cas-associated gene product can be produced via bacterial expression, in vitro translation, via yeast cells, in insect cells, or by other protein production techniques described in the art. After expression, the Cas-associated gene product is isolated, refolded if needed, purified and optionally treated to remove any purification tags, such as a His-tag. Once crude, partially purified, or more completely purified Cas-associated gene products are obtained, the protein may be introduced to, for example, a plant cell via electroporation, by bombardment with Cas-associated gene product coated particles, by chemical transfection or by some other means of transport across a cell membrane. Methods for introducing nucleic acids into bacterial and animal cells are similarly well known in the art. The protein can also be delivered using nanoparticles, which can deliver a combination of active protein and nucleic acid. Once a sufficient quantity of the Cas-associated gene product is introduced so that an effective amount of in vivo nuclease activity is present, along with the appropriate sgRNA, the protospacer sequences within the episomal or genomic target sites are cleaved. It is also recognized that one skilled in the art might create a Cas-associated gene product that is inactive but is activated in vivo by native processing machinery; such a Cas-associated gene product is also contemplated by this disclosure.

In another embodiment, a construct that will transiently express a sgRNA and/or Cas-associated gene is created and introduced into a cell. In yet another embodiment, the vector will produce sufficient quantities of the sgRNAs and/or Cas-associated gene in order for the desired episomal or genomic target site or sites to be effectively modified by CRISPR-mediated cleavage. For instance, the disclosure contemplates preparation of a vector that can be bombarded, electroporated, chemically transfected or transported by some other means across the plant cell membrane. Such a vector could have several useful properties. For instance, in one embodiment, the vector can replicate in a bacterial host such that the vector can be produced and purified in sufficient quantities for transient expression. In another embodiment, the vector can encode a drug resistance gene to allow selection for the vector in a host, or the vector can also comprise an expression cassette to provide for the expression of the sgRNA and/or Cas-associated gene in a plant. In a further embodiment, the expression cassette could contain a promoter region, a 5′ untranslated region, an optional intron to aid expression, a multiple cloning site to allow facile introduction of a sequence encoding sgRNAs and/or Cas-associated gene, and a 3′ UTR. In particular embodiments, the promoters in the expression cassette would be U6 promoters from Zea mays. In yet other embodiments, the promoters would be chimeric U6 promoters from Zea mays. In some embodiments, it can be beneficial to include unique restriction sites at one or at each end of the expression cassette to allow the production and isolation of a linear expression cassette, which can then be free of other vector elements. The untranslated leader regions, in certain embodiments, can be plant-derived untranslated regions. Use of an intron, which can be plant-derived, is contemplated when the expression cassette is being transformed or transfected into a monocot cell.

In other embodiments, one or more elements in the vector include a spacer complementary to a protospacer contained within an episomal or genomic target site. This facilitates CRISPR-mediated modification within the expression cassette, enabling removal and/or insertion of elements such as promoters and transgenes.

In another approach, a transient expression vector may be introduced into a cell using a bacterial or viral vector host. For example, Agrobacterium is one such bacterial vector that can be used to introduce a transient expression vector into a host cell. When using a bacterial, viral or other vector host system, the transient expression vector is contained within the host vector system. For example, if the Agrobacterium host system is used, the transient expression cassette would be flanked by one or more T-DNA borders and cloned into a binary vector. Many such vector systems have been identified in the art (reviewed in Hellens et al., 2000).

In embodiments whereby the sgRNA and/or Cas-associated gene is transiently introduced in sufficient quantities to modify a cell, a method of selecting the modified cell may be employed. In one such method, a second nucleic acid molecule containing a selectable marker is co-introduced with the transient sgRNA and/or Cas-associated gene. In this embodiment, the co-introduced marker may be part of a molecular strategy to introduce the marker at a target site. For example, the co-introduced marker may be used to disrupt a target gene by inserting between genomic target sites. In another embodiment, the co-introduced nucleic acid may be used to produce a visual marker protein such that transfected cells can be cell-sorted or isolated by some other means. In yet another embodiment, the co-introduced marker may randomly integrate or be directed via a second sgRNA:Cas-protein complex to integrate at a site independent of the primary genomic target site. In still yet another embodiment, the co-introduced molecule may be targeted to a specific locus via a double strand break repair pathway, which may include, for example, non-homologous end-joining, homologous recombination, synthesis-dependent strand annealing (SDSA), single-strand annealing (SSA), or a combination thereof, at the genomic target site(s). In the above embodiments, the co-introduced marker may be used to identify or select for cells that have likely been exposed to the sgRNA and/or Cas-associated gene and therefore are likely to have been modified by the CRISPR.

Stable Expression of CRISPRs

In another embodiment, a CRISPR expression vector is stably transformed into a cell so as to cleave a DNA sequence at or near a genomic target site in the host genome with a sgRNA and Cas-associated gene product encoded within the vector. In this embodiment, the design of the transformation vector provides flexibility for when and under what conditions the sgRNA and/or Cas-associated gene is expressed. Furthermore, the transformation vector can be designed to comprise a selectable or visible marker that will provide a means to isolate or efficiently select cell lines that contain and/or have been modified by the CRISPR.

Cell transformation systems have been described in the art and descriptions include a variety of transformation vectors. For example, for plant transformations, two principal methods include Agrobacterium-mediated transformation and particle gun bombardment-mediated (i.e., biolistic) transformation. In both cases, the CRISPR is introduced via an expression cassette. The cassette may contain one or more of the following elements: a promoter element that can be used to express the sgRNA and/or Cas-associated gene; a 5′ untranslated region to enhance expression; an intron element to further enhance expression in certain cell types, such as monocot cells; a multiple-cloning site to provide convenient restriction sites for inserting the sgRNA and/or Cas-associated gene sequences and other desired elements; and a 3′ untranslated region to provide for efficient termination of the expressed transcript. In particular embodiments, the promoters in the expression cassette would be U6 promoters from Zea mays. In yet other embodiments, the promoters would be chimeric U6 promoters from Zea mays.
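The ordering of cassette elements listed above can be represented, purely as an illustrative sketch, by a small Python data structure that concatenates whichever elements are present in 5′ to 3′ order. The class name `ExpressionCassette`, its field names, and the short placeholder sequences are hypothetical; the sketch assumes that the sgRNA or Cas-associated gene sequence has already been inserted at the multiple-cloning site, and it treats the promoter and inserted sequence as required only for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExpressionCassette:
    """Ordered elements of a cassette; optional elements may be omitted."""
    promoter: str
    payload: str  # sequence encoding the sgRNA or Cas-associated gene
    five_prime_utr: Optional[str] = None
    intron: Optional[str] = None
    three_prime_utr: Optional[str] = None

    def parts(self) -> list[tuple[str, str]]:
        """Return (element name, sequence) pairs in 5'->3' order."""
        ordered = [
            ("promoter", self.promoter),
            ("5' UTR", self.five_prime_utr),
            ("intron", self.intron),
            ("payload", self.payload),
            ("3' UTR", self.three_prime_utr),
        ]
        # Skip elements that were not supplied.
        return [(name, seq) for name, seq in ordered if seq]

    def assemble(self) -> str:
        """Concatenate the present elements into one linear sequence."""
        return "".join(seq for _, seq in self.parts())


if __name__ == "__main__":
    # Placeholder sequences only; not real promoter or sgRNA sequences.
    cassette = ExpressionCassette(promoter="AAAA", payload="GGGG",
                                  three_prime_utr="TTTT")
    print([name for name, _ in cassette.parts()])
    print(cassette.assemble())  # AAAAGGGGTTTT
```

Keeping the elements as named parts rather than a single string makes it straightforward to swap one promoter for another, for example a chimeric U6 promoter, without touching the rest of the cassette.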

For particle bombardment or with protoplast transformation, the expression cassette can be an isolated linear fragment or may be part of a larger construct that might contain bacterial replication elements, bacterial selectable markers or other elements. The sgRNA and/or Cas-associated gene expression cassette(s) may be physically linked to a marker cassette or may be mixed with a second nucleic acid molecule encoding a marker cassette. The marker cassette is comprised of necessary elements to express a visual or selectable marker that allows for efficient selection of transformed cells. In the case of Agrobacterium-mediated transformation, the expression cassette may be adjacent to or between flanking T-DNA borders and contained within a binary vector. In another embodiment, the expression cassette may be outside of the T-DNA. The presence of the expression cassette in a cell may be manipulated by positive or negative selection regime(s). Furthermore, a selectable marker cassette may also be within or adjacent to the same T-DNA borders or may be somewhere else within a second T-DNA on the binary vector (e.g., a 2 T-DNA system).

In another embodiment, cells that have been modified by a CRISPR, either transiently or stably, are carried forward along with unmodified cells. The cells can be sub-divided into independent clonally derived lines or can be used to regenerate independently derived plants. Individual plants or clonal populations regenerated from such cells can be used to generate independently derived lines. At any of these stages a molecular assay can be employed to screen for cells, plants or lines that have been modified. Cells, plants or lines that have been modified continue to be propagated and unmodified cells, plants or lines are discarded. In these embodiments, the presence of an active CRISPR in a cell is essential to ensure the efficiency of the overall process.

Transformation Methods

Methods for transforming or transfecting a cell are well known in the art. Methods for plant transformation using Agrobacterium or DNA coated particles are well known in the art and are incorporated herein. Suitable methods for transformation of host cells for use with the current disclosure are believed to include virtually any method by which DNA can be introduced into a cell, for example by Agrobacterium-mediated transformation (U.S. Pat. Nos. 5,563,055; 5,591,616; 5,693,512; 5,824,877; 5,981,840; and 6,384,301) and by acceleration of DNA coated particles (U.S. Pat. Nos. 5,015,580; 5,550,318; 5,538,880; 6,160,208; 6,399,861; and 6,403,865), etc. Through the application of techniques such as these, the cells of virtually any species may be stably transformed.

Various methods for selecting transformed cells have been described. For example, one might utilize a drug resistance marker such as a neomycin phosphotransferase protein to confer resistance to kanamycin, or use 5-enolpyruvyl shikimate phosphate synthase to confer tolerance to glyphosate. In another embodiment, a carotenoid synthase is used to create an orange pigment that can be visually identified. These three exemplary approaches can each be used effectively to isolate a cell or plant or tissue thereof that has been transformed and/or modified by a CRISPR.

When a nucleic acid sequence encoding a selectable or screenable marker is inserted into a genomic target site, the marker can be used to detect the presence or absence of a CRISPR or its activity. This may be useful once a cell has been modified by a CRISPR, and recovery of a genetically modified cell that no longer contains the CRISPR, or a regenerated plant from such a modified cell, is desired. In other embodiments, the marker may be intentionally designed to integrate at the genomic target site, such that it can be used to follow a modified cell independently of the CRISPR. The marker can be a gene that provides a visually detectable phenotype, such as in the seed, to allow rapid identification of seeds that carry or lack a CRISPR expression cassette.

This disclosure provides for a means to regenerate a plant from a cell with a repaired double-stranded break within a protospacer sequence at a genomic target site. The regenerant can then be used to propagate additional plants.

The disclosure additionally provides novel plant transformation vectors and expression cassettes which include novel U6 promoters, and U3, U2, U5 and 7SL promoters, and combinations thereof, with CRISPR-associated gene(s) and sgRNA expression cassettes. The disclosure further provides methods of obtaining a plant cell, a whole plant, and a seed or embryo that have been specifically modified using CRISPR-mediated cleavage. This disclosure also relates to a novel plant cell containing a CRISPR-associated Cas endonuclease expression construct and sgRNA expression cassettes.

Targeting Using Blunt-End Oligonucleotides

In certain embodiments, the CRISPR/Cas9 system can be utilized for targeting insertion of a blunt-end double-stranded DNA fragment into a genomic target site of interest. CRISPR-mediated endonuclease activity can introduce a double-strand break (DSB) in the protospacer of the selected genomic target site, and DNA repair, such as microhomology-driven non-homologous end-joining DNA repair, results in insertion of the blunt-end double-stranded DNA fragment into the DSB. Blunt-end double-stranded DNA fragments can be designed with 1-10 bp of microhomology, on both the 5′ and 3′ ends of the DNA fragment, that correspond to the 5′ and 3′ flanking sequence at the cut site of the protospacer in the genomic target site.
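
By way of illustration only, the design of microhomology ends described above might be sketched in Python as follows. The function name `add_microhomology`, its parameters, and the example insert sequence are hypothetical and do not appear in the disclosure. The sketch assumes a 3 bp PAM at the 3′ end of the target site and a cut 3 bp upstream of the PAM, as is generally reported for S. pyogenes Cas9, and it interprets "correspond to the 5′ and 3′ flanking sequence" as copying the bases immediately flanking the cut onto the matching ends of the fragment. The Zm7 genomic target site (SEQ ID NO:22) is used as input.

```python
# Zm7 genomic target site (protospacer + PAM), SEQ ID NO:22 in the disclosure.
ZM7_TARGET_SITE = "GCCGGCCAGCATTTGAAACATGG"


def add_microhomology(insert, target_site, mh_len=3, pam_len=3, cut_offset=3):
    """Flank `insert` with the bases found on each side of the predicted cut.

    Assumptions (not from the disclosure): the PAM occupies the last
    `pam_len` bases of `target_site`, and the nuclease cuts `cut_offset`
    bases upstream of the PAM.
    """
    if not 1 <= mh_len <= 10:
        raise ValueError("microhomology length must be 1-10 bp")
    # Index of the first base on the 3' side of the predicted cut.
    cut = len(target_site) - pam_len - cut_offset
    if cut - mh_len < 0 or cut + mh_len > len(target_site):
        raise ValueError("target site too short for requested microhomology")
    left = target_site[cut - mh_len:cut]    # bases immediately 5' of the cut
    right = target_site[cut:cut + mh_len]   # bases immediately 3' of the cut
    return left + insert + right


if __name__ == "__main__":
    # "TTAATTAA" is an arbitrary placeholder insert.
    print(add_microhomology("TTAATTAA", ZM7_TARGET_SITE))
```

Other interpretations of the flank orientation are possible; the offsets are exposed as parameters so that the sketch can be adapted to a different nuclease or design rule.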

Use of Custom CRISPRs in Molecular Breeding

In some embodiments, genome knowledge is utilized for targeted genetic alteration of a genome. At least one sgRNA can be designed to target at least one region of a genome to disrupt that region of the genome. This aspect of the disclosure may be especially useful for genetic alterations. The resulting plant could have a modified phenotype or other property depending on the gene or genes that have been altered. Previously characterized mutant alleles or introduced transgenes can be targeted for CRISPR-mediated modification, enabling creation of improved mutants or transgenic lines.

In another embodiment, a gene targeted for deletion or disruption may be a transgene that was previously introduced into the target plant or cell. This has the advantage of allowing an improved version of a transgene to be introduced or of allowing disruption of a selectable marker encoding sequence. In yet another embodiment, a gene targeted for disruption via CRISPR is at least one transgene that was introduced on the same vector or expression cassette as (an)other transgene(s) of interest, and resides at the same locus as another transgene. It is understood by those skilled in the art that this type of CRISPR-mediated modification may result in deletion or insertion of additional sequences. Thus it may, in certain embodiments, be preferable to generate a plurality of plants or cells in which a deletion has occurred, and to screen such plants or cells using standard techniques to identify specific plants or cells that have minimal alterations in their genomes following CRISPR-mediated modification. Such screens may utilize genotypic and/or phenotypic information. In such embodiments, a specific transgene may be disrupted while leaving the remaining transgene(s) intact. This avoids having to create a new transgenic line containing the desired transgenes without the undesired transgene.

In another aspect, the present disclosure includes methods for inserting a DNA fragment of interest into a specific site of a plant's genome, wherein the DNA fragment of interest is from the genome of the plant or is heterologous with respect to the plant. This disclosure allows one to select or target a particular region of the genome for nucleic acid (i.e., transgene) stacking (i.e., mega-locus). A targeted region of the genome may thus display linkage of at least one transgene to a haplotype of interest associated with at least one phenotypic trait, and may also result in the development of a linkage block to facilitate transgene stacking and transgenic trait integration, and/or development of a linkage block while also allowing for conventional trait integration.

Use of Custom CRISPRs in Trait Integration

Directed insertion, in at least one genomic protospacer site, of DNA fragments of interest, via CRISPR-mediated cleavage allows for targeted integration of multiple nucleic acids of interest (i.e., a trait stack) to be added to the genome of a plant in either the same site or different sites. Sites for targeted integration can be selected based on knowledge of the underlying breeding value, transgene performance in that location, underlying recombination rate in that location, existing transgenes in that linkage block, or other factors. Once the stacked plant is assembled, it can be used as a trait donor for crosses to germplasm being advanced in a breeding pipeline or be directly advanced in the breeding pipeline.

The present disclosure includes methods for inserting at least one nucleic acid of interest into at least one site, wherein the nucleic acid of interest is from the genome of a plant, such as a QTL or allele, or is transgenic in origin. A targeted region of the genome may thus display linkage of at least one transgene to a haplotype of interest associated with at least one phenotypic trait (as described in U.S. Patent Application Publication No. 2006/0282911), development of a linkage block to facilitate transgene stacking and transgenic trait integration, development of a linkage block to facilitate QTL or haplotype stacking and conventional trait integration, and so on.

In another embodiment of this disclosure, multiple unique sgRNAs can be used to modify multiple alleles at specific loci within one linkage block contained on one chromosome by making use of knowledge of genomic sequence information and the ability to design custom sgRNAs as described in the art. A sgRNA that is specific for, or can be directed to, a genomic target site that is upstream of the locus containing the non-target allele is designed or engineered as necessary. A second sgRNA that is specific for, or can be directed to, a genomic target site that is downstream of the target locus containing the non-target allele is also designed or engineered. The sgRNAs may be designed such that they complement genomic regions where there is no homology to the non-target locus containing the target allele. Both sgRNAs may be introduced into a cell using one of the methods described above.

The ability to execute targeted integration relies on the action of the sgRNA:Cas-protein complex and the endonuclease activity of the Cas-associated gene product. This advantage provides methods for engineering plants of interest, including a plant or cell, comprising at least one genomic modification.

A custom sgRNA can be utilized in a CRISPR system to generate at least one trait donor to create a custom genomic modification event that is then crossed into at least one second plant of interest, including a plant, wherein CRISPR delivery can be coupled with the sgRNA of interest to be used for genome editing. In other aspects one or more plants of interest are directly transformed with the CRISPR system and at least one double-stranded DNA fragment of interest for directed insertion. It is recognized that this method may be executed in various cell, tissue, and developmental types, including gametes of plants. It is further anticipated that one or more of the elements described herein may be combined with use of promoters specific to particular cells, tissues, plant parts and/or developmental stages, such as a meiosis-specific promoter.

In addition, the disclosure contemplates the targeting of a transgenic element already existing within a genome for deletion or disruption. This allows, for instance, an improved version of a transgene to be introduced, or allows selectable marker removal. In yet another embodiment, a gene targeted for disruption via CRISPR-mediated cleavage is at least one transgene that was introduced on the same vector or expression cassette as (an)other transgene(s) of interest, and resides at the same locus as another transgene.

In one aspect, the disclosure thus provides a method for modifying a locus of interest in a cell comprising (a) identifying at least one locus of interest within a DNA sequence; (b) creating a modified nucleotide sequence, in or proximal to the locus of interest, that includes a protospacer sequence within a genomic target site for a first sgRNA according to the disclosure; (c) introducing into at least one cell the sgRNA and Cas-associated gene, wherein the sgRNA and/or Cas-associated gene is expressed transiently or stably; (d) assaying the cell for a CRISPR-mediated modification in the DNA making up or flanking the locus of interest; and (e) identifying the cell or a progeny cell thereof as comprising a modification in said locus of interest.

Another aspect provides a method for modifying multiple loci of interest in a cell comprising (a) identifying multiple loci of interest within a genome; (b) identifying multiple genomic protospacer sites within each locus of interest; (c) introducing into at least one cell multiple sgRNAs and at least one Cas-associated gene according to the disclosure, wherein the cell comprises the genomic protospacer sites and the sgRNAs and Cas-associated gene are expressed transiently or stably and create a modified locus, or loci, that includes at least one CRISPR-mediated cleavage event; (d) assaying the cell for CRISPR-mediated modifications in the DNA making up or flanking each locus of interest; and (e) identifying a cell or a progeny cell thereof which comprises a modified nucleotide sequence at said loci of interest.

The disclosure further contemplates sequential modification of a locus of interest, by two or more sgRNAs and Cas-associated gene(s) according to the disclosure. Genes or other sequences added by the action of such a first CRISPR-mediated genomic modification may be retained, further modified, or removed by the action of a second CRISPR-mediated genomic modification.

The present invention thus includes a method for modifying a locus of interest in a crop plant such as maize (corn; Zea mays), soybean (Glycine max), cotton (Gossypium hirsutum; Gossypium sp.), peanut (Arachis hypogaea), barley (Hordeum vulgare); oats (Avena sativa); orchard grass (Dactylis glomerata); rice (Oryza sativa, including indica and japonica varieties); sorghum (Sorghum bicolor); sugar cane (Saccharum sp.); tall fescue (Festuca arundinacea); turfgrass species (e.g. species: Agrostis stolonifera, Poa pratensis, Stenotaphrum secundatum); wheat (Triticum aestivum); alfalfa (Medicago sativa); members of the genus Brassica, including broccoli, cabbage, carrot, cauliflower, Chinese cabbage; cucumber, dry bean, eggplant, tobacco, fennel, garden beans, gourd, leek, lettuce, melon, okra, onion, pea, pepper, pumpkin, radish, spinach, squash, sweet corn, tomato, watermelon, ornamental plants, and other fruit, vegetable, tuber, oilseed, and root crops, wherein oilseed crops include soybean, canola, oil seed rape, oil palm, sunflower, olive, corn, cottonseed, peanut, flaxseed, safflower, and coconut.

The genome modification may comprise a modified linkage block, the linking of two or more QTLs, disrupting linkage of two or more QTLs, gene insertion, gene replacement, gene conversion, deleting or disrupting a gene, transgenic event selection, transgenic trait donor selection, transgene replacement, or targeted insertion of at least one nucleic acid of interest.

Definitions

The definitions and methods provided define the present disclosure and guide those of ordinary skill in the art in the practice of the present disclosure. Unless otherwise noted, terms are to be understood according to conventional usage by those of ordinary skill in the relevant art. Definitions of common terms in molecular biology may also be found in Alberts et al., Molecular Biology of The Cell, 5th Edition, Garland Science Publishing, Inc.: New York, 2007; Rieger et al., Glossary of Genetics: Classical and Molecular, 5th edition, Springer-Verlag: New York, 1991; King et al., A Dictionary of Genetics, 6th ed., Oxford University Press: New York, 2002; and Lewin, Genes IX, Oxford University Press: New York, 2007. The nomenclature for DNA bases as set forth at 37 CFR § 1.822 is used.

As used herein, “CRISPR-associated genes” refers to nucleic acid sequences that encode polypeptide components of clustered regularly interspaced short palindromic repeats (CRISPR)-associated systems (Cas). Examples include, but are not limited to, Cas3 and Cas9, which encode endonucleases from the CRISPR type I and type II systems, respectively.

As used herein, “single-guide RNA (sgRNA)” refers to a crRNA:tracrRNA fused hybrid single-stranded RNA molecule encoded by a customizable DNA element that, generally, comprises a copy of a spacer sequence which is complementary to the protospacer sequence of the genomic target site, and a binding domain for an associated-Cas endonuclease of the CRISPR complex.

As used herein, “genomic target site” refers to a protospacer and a protospacer adjacent motif (PAM) located in a host genome selected for targeted mutation and/or double-strand break.

As used herein, “protospacer” refers to a short DNA sequence (12 to 40 bp) that can be targeted for mutation, and/or double-strand break, mediated by enzymatic cleavage with a CRISPR system endonuclease guided by complementary base-pairing with the spacer sequence in the crRNA or sgRNA.

As used herein, “protospacer adjacent motif (PAM)” includes a 3 to 8 bp sequence immediately adjacent to the protospacer sequence in the genomic target site.
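
For illustration, the relationship between a protospacer and its PAM within a genomic target site can be sketched with a short Python search. The function name `find_target_sites` and its parameters are hypothetical. The sketch assumes a 20 nt protospacer (within the 12 to 40 bp range defined above) followed immediately on its 3′ side by a 5′-NGG-3′ PAM, as is generally reported for S. pyogenes Cas9, and it scans only the strand supplied. The example input embeds the Zm7 genomic target site (SEQ ID NO:22) between arbitrary flanking bases.

```python
import re


def find_target_sites(seq, protospacer_len=20, pam_regex="[ACGT]GG"):
    """Return (start, protospacer, PAM) for each candidate site on the given strand.

    Assumptions (not from the disclosure): fixed protospacer length and an
    NGG-type PAM located immediately 3' of the protospacer.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


if __name__ == "__main__":
    # Zm7 target site (SEQ ID NO:22) with two arbitrary bases on each side.
    example = "AA" + "GCCGGCCAGCATTTGAAACATGG" + "TT"
    for start, protospacer, pam in find_target_sites(example):
        print(start, protospacer, pam)
```

A fuller search would also scan the reverse complement and accept other PAM patterns, which can be supplied through `pam_regex`.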

As used herein, “microhomology” refers to the presence of the same short sequence (1 to 10 bp) of bases in different polynucleotide molecules.

As used herein, “codon-optimized” refers to a polynucleotide sequence that has been modified to exploit the codon usage bias of a particular plant. The modified polynucleotide sequence still encodes the same, or substantially similar polypeptide as the original sequence but uses codon nucleotide triplets that are found in greater frequency in a particular plant.

As used herein, “non-protein-coding RNA (npcRNA)” refers to a non-coding RNA (ncRNA) which is a precursor small non-protein coding RNA, or a fully processed non-protein coding RNA, which are functional RNA molecules that are not translated into a protein.

As used herein, the term “chimeric” refers to the product of the fusion of portions of two or more different polynucleotide molecules, or to a gene expression element produced through the manipulation of known elements or other polynucleotide molecules. Novel chimeric regulatory elements can be designed or engineered by a number of methods. In one embodiment of the present disclosure, a chimeric promoter may be produced by fusing the 5′ portion of a U6 promoter from corn chromosome 1, which includes at least one Monocot-Specific Promoter (MSP) element, to the 3′ portion of the U6 promoter from corn chromosome 8, which includes an Upstream Sequence Element (USE) and a TATA Box. The resultant chimeric promoter may have novel expression properties relative to the first or second promoters.

As used herein, “promoter” refers to a nucleic acid sequence located upstream or 5′ to a translational start codon of an open reading frame (or protein-coding region) of a gene and that is involved in recognition and binding of RNA polymerase I, II, or III and other proteins (trans-acting transcription factors) to initiate transcription. A “plant promoter” is a native or non-native promoter that is functional in plant cells. Constitutive promoters are functional in most or all tissues of a plant throughout plant development. Tissue-, organ- or cell-specific promoters are expressed only or predominantly in a particular tissue, organ, or cell type, respectively. Rather than being expressed “specifically” in a given tissue, plant part, or cell type, a promoter may display “enhanced” expression, i.e., a higher level of expression, in one cell type, tissue, or plant part of the plant compared to other parts of the plant. Temporally regulated promoters are functional only or predominantly during certain periods of plant development or at certain times of day, as in the case of genes associated with circadian rhythm, for example. Inducible promoters selectively express an operably linked DNA sequence in response to the presence of an endogenous or exogenous stimulus, for example by chemical compounds (chemical inducers) or in response to environmental, hormonal, chemical, and/or developmental signals. Inducible or regulated promoters include, for example, promoters regulated by light, heat, stress, flooding or drought, phytohormones, wounding, or chemicals such as ethanol, jasmonate, salicylic acid, or safeners.

As used herein, an “expression cassette” refers to a polynucleotide sequence comprising at least a first polynucleotide sequence capable of initiating transcription of an operably linked second polynucleotide sequence and optionally a transcription termination sequence operably linked to the second polynucleotide sequence.

A palindromic sequence is a nucleic acid sequence that is the same whether read 5′ to 3′ on one strand or 3′ to 5′ on the complementary strand with which it forms a double helix. A nucleotide sequence is said to be a palindrome if it is equal to its reverse complement. A palindromic sequence can form a hairpin.
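
The reverse-complement definition above can be checked directly. The following minimal Python sketch is provided for illustration only; the function names are hypothetical, and only the four unambiguous DNA bases are handled.

```python
# Translation table mapping each DNA base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


if __name__ == "__main__":
    for s in ("GAATTC", "GATTC"):
        print(s, is_palindromic(s))
```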

In some embodiments, numbers expressing quantities of ingredients, properties such as molecular weight, reaction conditions, and so forth, used to describe and claim certain embodiments of the present disclosure are to be understood as being modified in some instances by the term “about.” In some embodiments, the term “about” is used to indicate that a value includes the standard deviation of the mean for the device or method being employed to determine the value. In some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the present disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the present disclosure may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements. The recitation of ranges of values herein is merely intended to serve as a shorthand method of referring individually to each separate value falling within the range. Unless otherwise indicated herein, each individual value is incorporated into the specification as if it were individually recited herein.

In some embodiments, the terms “a” and “an” and “the” and similar references used in the context of describing a particular embodiment (especially in the context of certain of the following claims) can be construed to cover both the singular and the plural, unless specifically noted otherwise. In some embodiments, the term “or” as used herein, including the claims, is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive.

The terms “comprise,” “have” and “include” are open-ended linking verbs. Any forms or tenses of one or more of these verbs, such as “comprises,” “comprising,” “has,” “having,” “includes” and “including,” are also open-ended. For example, any method that “comprises,” “has” or “includes” one or more steps is not limited to possessing only those one or more steps and can also cover other unlisted steps. Similarly, any composition or device that “comprises,” “has” or “includes” one or more features is not limited to possessing only those one or more features and can cover other unlisted features.

All methods described herein can be performed in any suitable order unless otherwise indicated herein or otherwise clearly contradicted by context. The use of any and all examples, or exemplary language (e.g., “such as”) provided with respect to certain embodiments herein is intended merely to better illuminate the present disclosure and does not pose a limitation on the scope of the present disclosure otherwise claimed. No language in the specification should be construed as indicating any non-claimed element essential to the practice of the present disclosure.

Groupings of alternative elements or embodiments of the present disclosure disclosed herein are not to be construed as limitations. Each group member can be referred to and claimed individually or in any combination with other members of the group or other elements found herein. One or more members of a group can be included in, or deleted from, a group for reasons of convenience or patentability.

Having described the present disclosure in detail, it will be apparent that modifications, variations, and equivalent embodiments are possible without departing from the scope of the present disclosure defined in the appended claims. Furthermore, it should be appreciated that all examples in the present disclosure are provided as non-limiting examples.

EXAMPLES

The following examples are included to demonstrate embodiments of the disclosure. It should be appreciated by those of skill in the art that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the concept, spirit and scope of the disclosure. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the disclosure as defined by the appended claims.

Example 1 Identification of Promoters to Express sgRNA

To enable genome engineering in corn, soy, and tomato using the CRISPR-based gene targeting system, novel U6 promoters native to these three genomes were identified. After BLAST searching for the highly conserved U6 gene in corn, soy, and tomato genomes, 200-600 bp of sequence upstream of these putative U6 genes was selected to test for promoter function (Table 1). Four U6 promoters were identified from the corn B73 genome, one each on chromosome 1 (SEQ ID NO:1), chromosome 2 (SEQ ID NO:3), chromosome 3 (SEQ ID NO:5), and chromosome 8 (SEQ ID NO:7). A multiple sequence alignment of these four corn U6 promoters and corresponding U6 genes was compiled as shown in FIGS. 1A and B. For each of these corn U6 promoters, conserved U6 promoter motifs (e.g., TATA Box, Upstream Sequence Element (USE), and Monocot-Specific Promoter (MSP) elements (Connelly, Mol. Cell Biol. 14:5910-5919, 1994)) are present (FIG. 1B). A guanine nucleobase following the poly-T tracts was conserved among these four genes, and may have a significant role in transcription. The sequence consensus and percent conservation are presented below the alignment (FIG. 1). Based on the multiple sequence alignment, the conserved motifs of these U6 promoters were within the 140 bp proximal to the transcription start site. Based on the proximity of these conserved U6 promoter motifs, 200 bp of the proximal upstream sequence from the transcription start site for each of the corn chromosome U6 promoters, chromosome 1 (SEQ ID NO:2), chromosome 2 (SEQ ID NO:4), chromosome 3 (SEQ ID NO:6), and chromosome 8 (SEQ ID NO:8) was selected for testing for efficient promoter activity in sgRNA expression cassettes.

In addition to the four corn U6 promoters, chimeric U6 promoters were designed. Four chimeric corn U6 promoters were designed using differing combinations of the corn U6 promoters from chromosome 1, 2, and 8, with each chimeric promoter being 397 bp in length. The breakpoints of the chimeras were determined so that the conserved elements (e.g., USE, MSP, and TATA box) of different chromosomal origins were mixed in the new chimeric U6 promoters but retained their relative spacing to the native corn U6 promoters. For example, the 5′ end of the U6 promoter including MSP and USE were derived from one chromosome, while the 3′ end including the TATA box and one or more MSP elements were derived from a second chromosome. Although the corn U6 promoter from chromosome 2 was not a very strong promoter in its native form, it included more than one MSP element. Consequently, chimeras that include mainly chromosome 1 and/or 8 sequence can also include one or more chromosome 2 MSP elements. Specifically, the 5′ portion of chimera 1 (SEQ ID NO:17) is derived from the U6 promoter from corn chromosome 1 (SEQ ID NO:1), including one MSP element, and the 3′ portion of this chimera is derived from the U6 promoter from corn chromosome 8 (SEQ ID NO:7), including a USE element and a TATA box. Similarly, the 5′ portion of chimera 2 (SEQ ID NO:18) is derived from the U6 promoter from corn chromosome 1 (SEQ ID NO:1), including one MSP element, and the 3′ portion of this chimera is derived from the U6 promoter from corn chromosome 8 (SEQ ID NO:7), including a second MSP element, a USE element, and a TATA box. The 5′ portion of chimera 3 (SEQ ID NO:19) is derived from the U6 promoter from corn chromosome 8 (SEQ ID NO:7), including one MSP element, and the 3′ portion of this chimera is derived from the U6 promoter from corn chromosome 1 (SEQ ID NO:1), including a second MSP element, a USE element, and a TATA box. Additionally, for chimera 3, there is a 3 bp deletion beginning at bp 100 of SEQ ID NO:7, and the 5′ end of the chimera begins with 5′-AAG-3′. Chimera 4 (SEQ ID NO:20) was derived from the U6 promoter from corn chromosome 8 (SEQ ID NO:7), including the MSP element, the USE element and the TATA box. However, this chimera also includes two additional MSP elements (for a total of 3 MSP elements) derived from the U6 promoter of corn chromosomes 1 and 2.
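
As a simplified illustration of the breakpoint approach described above, the following Python sketch joins the 5′ portion of one promoter to the 3′ portion of another at a shared coordinate, so that the chimera keeps the parental length and the relative spacing of its elements. The function name `fuse_promoters`, the breakpoint value, and the placeholder sequences are hypothetical; the actual breakpoints and sequences of SEQ ID NOs:17-20 are not reproduced here, and the sketch assumes two equal-length, position-aligned promoters.

```python
def fuse_promoters(five_prime_source, three_prime_source, breakpoint):
    """Join the 5' part of one promoter to the 3' part of another.

    Assumption (not from the disclosure): both promoters have the same
    length and are aligned position-for-position, so that cutting both at
    the same coordinate preserves element spacing in the chimera.
    """
    if len(five_prime_source) != len(three_prime_source):
        raise ValueError("promoters must be the same length")
    if not 0 < breakpoint < len(five_prime_source):
        raise ValueError("breakpoint must fall inside the promoter")
    return five_prime_source[:breakpoint] + three_prime_source[breakpoint:]


if __name__ == "__main__":
    # Placeholder 397 bp sequences standing in for two native promoters.
    promoter_a = "A" * 397
    promoter_b = "C" * 397
    chimera = fuse_promoters(promoter_a, promoter_b, breakpoint=150)
    print(len(chimera))
```

A design with more than one breakpoint, as in chimera 4, could be modeled by applying the same slicing to three or more segments.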

TABLE 1
U6 promoters from corn (Zea mays), tomato (Solanum lycopersicum), and soybean (Glycine max), their chromosomal source and length.

SEQ ID NO.   Source                 Chromosome                 Length (bp)
 1           Zea mays               1                          397
 2           Zea mays               1                          200
 3           Zea mays               2                          397
 4           Zea mays               2                          200
 5           Zea mays               3                          397
 6           Zea mays               3                          200
 7           Zea mays               8                          397
 8           Zea mays               8                          200
 9           Solanum lycopersicum   10                         540
10           Solanum lycopersicum   1                          600
11           Solanum lycopersicum   7                          540
12           Glycine max            6                          540
13           Glycine max            16                         540
14           Glycine max            19                         540
15           Glycine max            4                          540
16           Glycine max            19                         420
17           Zea mays               Chimeric: 1 + 8            397
18           Zea mays               Chimeric: 1 + 8            397
19           Zea mays               Chimeric: 8 + 1            397
20           Zea mays               Chimeric: 8 + 2 + 1 + 8    397

Example 2 Identification of Cas9 Genes to Enable Genome Engineering in Plants

The S. pyogenes Cas9 sequence (SEQ ID NO:28 is the polypeptide sequence of Cas9 with NLS, and SEQ ID NO:96 is the polypeptide sequence of Cas9 without NLS) was used for CRISPR-mediated site-directed targeting of a reporter construct in immature corn embryos. For expression, the codon-optimized nucleotide sequence of Cas9 was designed into an expression vector capable of expression in a plant. This Cas9 expression vector contained a 35S promoter driving expression of the Cas9 open reading frame, a NLS sequence incorporated into the 3′ end of the Cas9 coding region, and a Nos transcription termination sequence (SEQ ID NO:29).

A Cas9 protein (SEQ ID NO:26), and a monocot codon-optimized version of the nucleotide sequence encoding the same (SEQ ID NO:27), were identified from the plant-related bacteria Bradyrhizobium, and can be useful for increasing the robustness of CRISPR/Cas-mediated genome modification in plants. A Cas9 protein (SEQ ID NO:69) and a monocot codon-optimized version thereof (SEQ ID NO:68), were identified from Streptococcus thermophilus, and can be useful for increasing the robustness of CRISPR/Cas-mediated genome modification in plants. Additional Cas9 genes from plant-related bacteria (e.g., symbiotic or pathogenic bacteria) can also be identified.

Example 3 Single-Guide RNA Cassette Design

A set of single-guide RNA (sgRNA) expression cassettes were designed to target a protospacer in a corn genomic target site referred to as Zm7 (5′-GCCGGCCAGCATTTGAAACATGG-3′, SEQ ID NO:22). The different expression cassettes included one of the 397 bp U6 promoters from corn: chromosome 1 (SEQ ID NO:30), chromosome 2 (SEQ ID NO:32), chromosome 3 (SEQ ID NO:34), or chromosome 8 (SEQ ID NO:36); or one of the 200 bp U6 promoters from corn: chromosome 1 (SEQ ID NO:31), chromosome 2 (SEQ ID NO:33), chromosome 3 (SEQ ID NO:35), or chromosome 8 (SEQ ID NO:37). Each expression cassette also contained: i) the U6 poly-T terminator conserved in each of the four corn U6 genes; ii) a sgRNA including a copy of the spacer sequence 5′-GCCGGCCAGCATTTGAAACA-3′ (SEQ ID NO:23) corresponding to the protospacer of the Zm7 genomic target site (SEQ ID NO:22); and iii) the conserved 3′ domain of a sgRNA providing the Cas endonuclease binding domain, and ending with the U6 poly-T tract (SEQ ID NO:21).
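
The cassette layout described above (promoter, spacer, then the conserved sgRNA 3′ domain ending in a poly-T tract) can be sketched in Python as follows. This is an illustration under stated assumptions, not the sequences of the disclosure: the class and function names are hypothetical, and the promoter and 3′ domain strings are placeholders rather than SEQ ID NOs:30-37 or SEQ ID NO:21. Only the Zm7 target site (SEQ ID NO:22) and its spacer (SEQ ID NO:23) are taken from the text, and the spacer is assumed to be the target site without its 3 bp PAM.

```python
from dataclasses import dataclass

# Zm7 genomic target site (protospacer + PAM), SEQ ID NO:22 in the disclosure.
ZM7_TARGET_SITE = "GCCGGCCAGCATTTGAAACATGG"


def spacer_from_target_site(target_site, pam_len=3):
    """Drop the 3' PAM to obtain the spacer copied into the sgRNA."""
    return target_site[:-pam_len]


@dataclass(frozen=True)
class SgRnaCassette:
    promoter: str  # e.g. a 397 bp or 200 bp U6 promoter (placeholder here)
    spacer: str    # copy of the protospacer sequence
    scaffold: str  # conserved sgRNA 3' domain ending with a poly-T tract

    def sequence(self):
        """Concatenate the parts 5' to 3' in cassette order."""
        return self.promoter + self.spacer + self.scaffold


if __name__ == "__main__":
    cassette = SgRnaCassette(
        promoter="N" * 397,             # placeholder promoter
        spacer=spacer_from_target_site(ZM7_TARGET_SITE),
        scaffold="N" * 10 + "TTTTTTT",  # placeholder 3' domain + poly-T
    )
    print(cassette.spacer, len(cassette.sequence()))
```

Swapping the `promoter` field between the 397 bp and 200 bp variants, while holding the spacer and 3′ domain fixed, mirrors how the set of Zm7 cassettes differs only in its promoter.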

Similarly, a set of sgRNA cassettes were designed with one of the four corn U6 397 bp promoters (SEQ ID NO:30, SEQ ID NO:32, SEQ ID NO:34, or SEQ ID NO:36; see Table 2), and the spacer sequence of the sgRNA complementary to the protospacer of the corn genomic target site referred to as Zm231 (SEQ ID NO:24). Table 3 lists the corresponding SEQ ID NOs for the DNA and RNA sequences of the sgRNAs containing the Zm7, Zm231, and Zm14 target sites. A negative control sgRNA cassette was designed with the corn U6 397 bp promoter from corn chromosome 8 (SEQ ID NO:36) and spacer sequence of the sgRNA complementary to the protospacer of the corn genomic target site referred to as Zm14 (SEQ ID NO:24). This negative control sgRNA cassette was designed with a spacer sequence of the sgRNA that is non-complementary to the protospacer sequence of the Zm231 corn genomic target site. Inclusion of a sgRNA comprising the spacer sequence complementary to the Zm14 corn genomic target site will not result in CRISPR/Cas-mediated cleavage of the protospacer sequence of the Zm231 corn target protospacer site. These Zm231 and Zm14 sgRNA cassettes are represented by SEQ ID NO:38, SEQ ID NO:39, SEQ ID NO:40, SEQ ID NO:41, and SEQ ID NO:42 (Table 2). Each of these sgRNA cassettes also contains at the 3′ end of the sgRNA sequence a U6 poly-T tract.

TABLE 2
Cassettes with the indicated corn (Zea mays) U6 promoters and sgRNA containing spacers complementary to the protospacer sequence of the indicated corn genomic target sites.

SEQ ID NO.   U6 Promoter from Chromosome   U6 Promoter Length (bp)   Genomic target site
30           1                             397                       Zm7
31           1                             200                       Zm7
32           2                             397                       Zm7
33           2                             200                       Zm7
34           3                             397                       Zm7
35           3                             200                       Zm7
36           8                             397                       Zm7
37           8                             200                       Zm7
38           1                             397                       Zm231
39           2                             397                       Zm231
40           3                             397                       Zm231
41           8                             397                       Zm231
42           8                             397                       Zm14

TABLE 3 DNA and RNA sequences of Streptococcus pyogenes sgRNAs containing spacer sequences complementary to the protospacer sequence of the corn genomic target sites Zm7, Zm231, and Zm14.

SEQ ID NO. (DNA)   SEQ ID NO. (RNA)   Genomic target site
76                 79                 Zm7
77                 80                 Zm231
78                 81                 Zm14

Example 4 CRISPR Activity in Corn—Modified GUS Reporter Assay

To determine the efficiency of CRISPR/Cas-mediated gene targeting in corn, a system for the transient expression of a reporter gene in immature corn embryos was used. In addition to the sgRNA cassettes described above, the design incorporated an expression cassette encoding the Cas9 endonuclease of Streptococcus pyogenes (SEQ ID NO:28), which contained a nuclear localization signal (NLS) sequence and was codon-optimized for expression in corn.

The reporter gene construct for these experiments was a cassette containing a modified β-glucuronidase (GUS) coding sequence with a corn genomic target site (protospacer and PAM) for targeted CRISPR cleavage (e.g., the Zm7 (SEQ ID NO:22), Zm231 (SEQ ID NO:44), or Zm14 (SEQ ID NO:43)) engineered into the reporter gene and surrounded by an internal direct repeat of the GUS coding sequence (FIG. 2). When the reporter is co-delivered with expression vectors for CRISPR components, if the CRISPR system cleaves the protospacer sequence, the endogenous plant single-strand annealing (SSA) pathway of homologous recombination DNA repair will reconstitute a functional GUS gene. These modified GUS reporter constructs were named GU-Zm7-US, GU-Zm231-US, or GU-Zm14-US, referring to the corn genomic target site inserted into the GUS gene, Zm7, Zm231, and Zm14, respectively. One of the modified GUS reporter gene cassettes was co-delivered with expression vectors for the other CRISPR components (e.g., one of the sgRNA cassettes) and the expression cassette encoding the Cas9 endonuclease (SEQ ID NO:28). Expression cassettes were mixed and co-coated on 0.6 μm gold particles using standard protocols. 3-day-old pre-cultured immature corn embryos were then bombarded with these prepared gold particles. Embryos were maintained in culture for 3-5 days after bombardment and then processed for histochemical staining using X-Gluc (5-bromo-4-chloro-3-indolyl glucuronide) and standard laboratory protocols.

If CRISPR-mediated Cas9 endonuclease activity occurs at the protospacer site in the modified reporter gene construct, then GUS activity is detected as blue foci using histochemical staining and X-Gluc (FIGS. 3 and 4).

Separate expression cassettes were designed to contain one of four corn U6 promoters (from chromosomes 1, 2, 3, and 8) driving expression of a sgRNA containing a spacer sequence complementary to the protospacer of the corn Zm7 genomic target site (FIG. 3). To prepare samples for the expression assay, 0.6 μm gold particles were coated with 0.6 pmol of one of the Zm7-sgRNA constructs and 0.3 pmol of each of the other constructs (the Cas9 expression cassette and the Zm7-modified reporter construct (GU-Zm7-US)). Once the coated gold particles were prepared, ¼ of the mixture was used for bombardment of 3-day-old immature corn embryos using standard protocols. More than 50 immature corn calli were bombarded for each set of constructs evaluated, and staining was done 5 days post-bombardment. Following staining, photographs of representative calli (overview of several calli and a close-up view of a single callus) were taken (FIG. 3). The modified reporter construct GU-Zm7-US was designed to contain the Zm7 genomic target site (SEQ ID NO:22), and the sgRNA was designed to contain a copy of the Zm7 spacer (SEQ ID NO:23). The Zm7-sgRNA spacer was incorporated into expression cassettes with one of the four 397 bp corn U6 promoters from chromosome 1 (SEQ ID NO:30), chromosome 2 (SEQ ID NO:32), chromosome 3 (SEQ ID NO:34), or chromosome 8 (SEQ ID NO:36). Negative controls used in the transformation included the modified reporter construct GU-Zm7-US with the Zm7 genomic target site and: (1) lacking both the Cas9 endonuclease expression cassette and the Zm7-sgRNA expression cassette; or (2) lacking just the Zm7-sgRNA expression cassette (FIG. 3). For both of these controls no blue sectors were detected, indicating no CRISPR-mediated cleavage of the modified reporter construct had occurred.
The results from evaluation of the four different 397 bp corn U6 promoters in driving expression of the Zm7-sgRNA cassette showed that while all four 397 bp corn U6 promoters worked (i.e., blue sectors detected in the calli), the efficacy of the different promoters varied (as evidenced by the size and number of blue sectors in the calli). The U6 promoter from corn chromosome 8 showed the most efficacy, followed by the U6 promoter from chromosome 1. The U6 promoters from chromosomes 2 and 3 showed similar efficacy to each other (Chr 8>Chr 1>Chr 2≈Chr 3).

The specificity of the CRISPR/Cas9 system in this corn expression system was evaluated by testing mismatches between the protospacer sequence within the genomic target site in the modified GUUS reporter gene construct and the spacer sequence included in the varying sgRNA constructs (FIG. 4). As in the experiment described above, 0.6 μm gold particles were coated with one or more constructs: 0.3 pmol of the individual modified GUUS reporter construct (GUUS target), 0.16 pmol of the Cas9 endonuclease expression cassette, 0.3 pmol of the individual sgRNA cassettes, and 0.03 pmol of a transformation control construct expressing green fluorescent protein (GFP) (FIG. 4). Once the coated gold particles were prepared, ¼ of the mixture was used for bombardment of 3-day-old immature corn embryos using standard protocols. More than 50 immature corn calli were bombarded for each set of constructs evaluated. Tissue was maintained in culture for 3 days post-bombardment. GFP expression was determined by fluorescence microscopy on day 1 and again on day 3 to validate uniform bombardment and transformation. After the fluorescence microscopy on day 3, the calli were processed for X-Gluc staining, and fluorescent and light micrographs of representative calli were taken (FIG. 4). The GFP fluorescence observed in all calli indicated good transformation.

Negative controls used in the transformation included the modified reporter construct GU-Zm231-US with the Zm231 genomic target site (1) lacking both the Cas9 endonuclease expression cassette and any sgRNA expression cassette; or (2) having a Zm231-sgRNA expression cassette with a corn U6 promoter from chromosome 8, but lacking the Cas9 endonuclease expression cassette (FIG. 4). Both of these controls showed no blue sectors detected with X-Gluc staining, indicating no CRISPR-mediated cleavage of the modified reporter construct had occurred (FIG. 4).

The specificity of the CRISPR/Cas9 system was also evaluated using controls including a mismatch between the protospacer site in the modified GUUS reporter construct and the sgRNA spacer sequence. Specifically, the mismatch was between the modified reporter construct GU-Zm231-US with the Zm231 genomic target site and (1) the sgRNA expression cassette with the Zm14 spacer and a corn U6 promoter from chromosome 8; or (2) the sgRNA expression cassette with the Zm231 spacer sequence and a corn U6 promoter from chromosome 8 (FIG. 4).

Finally, the 397 bp corn U6 promoters (chromosome 1, 2, 3, and 8) were each used to generate sgRNA expression cassettes with the Zm231 genomic target site. These were each co-transformed with the modified reporter construct GU-Zm231-US made with the Zm231 genomic target site. Results indicated that when the sgRNA spacer sequence and the genomic target site of the reporter construct were mismatched, there was very little GUS activity detected. By contrast, when the sgRNA spacer sequence and the genomic target site of the reporter construct were matched, many large blue foci were detected (FIG. 4). The U6 promoter from corn chromosome 8 may have higher efficacy (based on the assumption that efficacy correlates to blue foci which were more numerous, larger in size, and darker in staining intensity), followed by the U6 promoter from corn chromosome 1. The U6 promoters from corn chromosomes 2 and 3 showed similar efficacy to each other (Chr 8>Chr 1>Chr 2≈Chr 3).

The sgRNA driven by the U6 promoter from corn chromosome 8 consistently showed high activity. These findings suggest that different corn U6 promoters have differing activities, and further highlight the usefulness of the U6 promoter derived from corn chromosome 8 in the CRISPR/Cas system of targeted genome modification.

Example 5 Blunt-End Oligonucleotide Integration

The CRISPR/Cas9 system was evaluated for targeting efficacy of insertion of a blunt-end double-stranded DNA fragment into one of three genomic target sites, identified as Zm_L70a (SEQ ID NO:47), Zm_L70c (SEQ ID NO:59), and Zm_L70d (SEQ ID NO:61) within the corn genome. Each of these three genomic target sites is unique in the corn genome. If the CRISPR components are capable of endonuclease activity and introduce a double strand break (DSB) in the protospacer of the selected genomic target site, then the endogenous corn non-homologous end-joining DNA repair system will insert the blunt-end double-stranded DNA fragment into the DSB.

Complementary oligonucleotides were pre-annealed to form blunt-ended double-stranded DNA fragments, and these were co-transformed with CRISPR constructs into corn protoplasts (FIG. 5A). The oligonucleotide pairs were designed to either (1) not contain microhomology regions (see FIG. 5B), or (2) contain on each end (5′ and 3′) a 3 bp microhomology to the corresponding 5′ and 3′ flanking sequence at the cut site of the protospacer in the genomic target site (FIG. 5C). The microhomology sequences may promote blunt-end double-strand DNA fragment integrations through a mechanism of microhomology-driven non-homologous end-joining at the genomic target site. The two sequences of the oligonucleotide pair without microhomology sequence were SEQ ID NO:45 and SEQ ID NO:46. The three pairs of oligonucleotides, each containing microhomology to their respective genomic target site, were annealed in pairwise combinations of the following oligonucleotides: (1) SEQ ID NO:62 and SEQ ID NO:63 (microhomology to Zm_L70a); (2) SEQ ID NO:64 and SEQ ID NO:65 (microhomology to Zm_L70c); and (3) SEQ ID NO:66 and SEQ ID NO:67 (microhomology to Zm_L70d) to form blunt-end double-strand DNA fragments.
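The oligonucleotide design described above can be sketched as follows. This is only an illustration under stated assumptions: the cut site is assumed to lie 3 bp 5′ of the PAM (as is generally reported for S. pyogenes Cas9), the core insert sequence is hypothetical, and the Zm7 target site (SEQ ID NO:22) is used as a stand-in because the L70 target sequences are not reproduced in this text. The function names are likewise hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forms_blunt_duplex(top, bottom):
    """True if two oligos anneal over their full length with no overhangs."""
    return len(top) == len(bottom) and reverse_complement(top) == bottom.upper()


def add_microhomology(core, target_site, pam_len=3, cut_offset=3, mh_len=3):
    """Flank a core insert with the bases found on either side of the cut site.

    Assumption: target_site is protospacer + PAM (5'->3') and the nuclease
    cuts cut_offset bases 5' of the PAM.
    """
    site = target_site.upper()
    cut = len(site) - pam_len - cut_offset
    left = site[cut - mh_len:cut]    # bases immediately 5' of the cut
    right = site[cut:cut + mh_len]   # bases immediately 3' of the cut
    return left + core.upper() + right


core = "ACTGACTGAC"                           # hypothetical insert sequence
stand_in_site = "GCCGGCCAGCATTTGAAACATGG"     # Zm7 (SEQ ID NO:22) as a stand-in
top = add_microhomology(core, stand_in_site)
bottom = reverse_complement(top)
print(top)
print(forms_blunt_duplex(top, bottom))
```

Because the bottom oligonucleotide is the full-length reverse complement of the top one, the annealed pair has no single-stranded overhang, which is what makes the fragment blunt-ended.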

For these blunt-end double-strand DNA fragment integration assays, the CRISPR constructs used included the Cas9 endonuclease expression cassette described above, and one of three sgRNA expression cassettes. The three sgRNA expression cassettes were each driven by the 397 bp version of the U6 promoter from corn chromosome 8 (SEQ ID NO:7) and contained the spacer sequence corresponding to the genomic target sites: Zm_L70a (SEQ ID NO:48), Zm_L70c (SEQ ID NO:58), and Zm_L70d (SEQ ID NO:60). Differing combinations of the CRISPR components and oligonucleotides for these assays were mixed as follows: 0.6 pmol of the Cas9 expression cassette, 1.6 pmol of one of the sgRNA expression cassettes, and 35 pmol of the pre-annealed oligonucleotide pair. Each mixture was transformed, using a standard PEG-mediated protocol, into aliquots of corn leaf protoplast suspensions containing about 320,000 cells. Two days later, corn protoplasts were harvested and analyzed for insertion of the blunt-end double-strand DNA fragment into the particular L70 genomic target site targeted by the unique sgRNA selected in each case (Table 4). The negative control was the omission of the Cas9 expression cassette during the corn protoplast transformation.

To detect the insertion of the blunt-end double strand DNA fragment into the corn chromosome, DNA was extracted and high-throughput thermal amplification (PCR) was done with multiple pairs of primers (Table 5). As the blunt-end double strand DNA fragment may insert into the CRISPR cleaved chromosomal DNA in either orientation, primers were designed to one strand of the blunt-end double strand DNA fragment and to both flanking genomic regions, with each primer pair spanning the junction of the insertion site. The PCR amplicons were separated on a fragment analysis platform (ABI3730 DNA analyzer) from Life Technologies (Grand Island, N.Y.). This platform, which is more sensitive than gel-based electrophoresis methods and has single-bp resolution, confirmed whether the amplicons originated from the template of interest and whether they were specific to the experimental treatment conditions.

TABLE 4 DNA and RNA sequences of Streptococcus pyogenes sgRNA containing spacer sequences complementary to the protospacer sequence of the corn genomic target sites L70a, L70c, L70d.

SEQ ID NO. (DNA)   SEQ ID NO. (RNA)   Genomic target site
82                 85                 Zm_L70a
83                 86                 Zm_L70c
84                 87                 Zm_L70d

One representative fragment analysis profile is shown in FIG. 5D (Experiment T3, Table 5). Amplification of DNA extracted from corn protoplasts transformed with Cas9, sgRNA containing spacer sequences complementary to the protospacer sequence of the Zm_L70c corn genomic target site (SEQ ID NO:83), and the blunt-end double-stranded DNA fragment without microhomology, using primers at the Zm_L70c genomic target site (SEQ ID NO:49, primer specific for the inserted blunt-end double-strand DNA fragment, and SEQ ID NO:55, primer specific for flanking genomic DNA) revealed a major peak of the expected size and several additional peaks of similar sizes (arrow) (FIG. 5D, top panel). By contrast, no amplification products were seen from DNA extracted from the negative control transformations (FIG. 5D, bottom panel). This PCR profile was consistent with double-stranded breaks repaired erroneously by non-homologous end-joining, resulting in introduction of short indels at the site of repair.

To confirm that the blunt-end double-strand DNA fragment was incorporated at the genomic target site, the PCR amplicons were cloned and sequenced (Table 5). Negative controls lacking Cas9 proteins did not produce PCR products. Seven of the ten experiments showed the expected pattern: a positive PCR product of the expected size for the test samples, and no PCR product for control samples. The seven experiments showing a positive PCR product included experiments demonstrating integrations occurring for both blunt-end double-strand DNA fragments with and without microhomology. Experiments T1 and T7 failed to detect targeted integrations in either test or control samples. PCR products from six of the experiments were cloned and sequenced, confirming the expected DNA fragment-chromosome junctions for blunt-end double-strand DNA fragment integration. Sequencing results showed the presence of both full-length and truncated DNA fragments (indels) present at the site of blunt-end double-strand DNA fragment integration (see, e.g., FIG. 5E, Experiment T1). Sequences were consistent with the fragment analysis (FIG. 5D) and demonstrated that CRISPR/Cas9 can target native, sequence-specific, chromosomal loci for cleavage in corn protoplasts. These results also demonstrated successful blunt-end double-strand DNA fragment integration with and without regions of microhomology.

TABLE 5 Blunt-end oligonucleotide insertion assay.

Experiment  Treatments   Genomic protospacer/  Microhomology  Orientation  Primer pairs   Expected amplicon  Expected  Sequenced
                         target site                                       (SEQ ID NOs)   size (bp)          amplicon  amplicon
T1          test         L70a                  −              +            50/49          408                −         −
            (−) control  L70a                  −              +            50/49          N/A                −         −
T2          test         L70a                  −              −            51/49          324                +         +
            (−) control  L70a                  −              −            51/49          N/A                −         −
T3          test         L70c                  −              +            55/49          384                +         +
            (−) control  L70c                  −              +            55/49          N/A                −         −
T4          test         L70c                  −              −            54/49          411                +         +
            (−) control  L70c                  −              −            54/49          N/A                −         −
T5          test         L70c                  +              +            55/49          384                +         +
            (−) control  L70c                  +              +            55/49          N/A                −         −
T6          test         L70c                  +              −            54/49          411                +         −
            (−) control  L70c                  +              −            54/49          N/A                −         −
T7          test         L70d                  −              +            56/49          359                −         −
            (−) control  L70d                  −              +            56/49          N/A                −         −
T8          test         L70d                  −              −            57/49          356                +         +
            (−) control  L70d                  −              −            57/49          N/A                −         −
T9          test         L70d                  +              +            56/49          359                +         +
            (−) control  L70d                  +              +            56/49          N/A                −         −
T10         test         L70d                  +              −            57/49          356                +         −
            (−) control  L70d                  +              −            57/49          N/A                *         −

Where * = sample contaminated.
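The counts stated in the text for this example can be checked against Table 5 with a short tally. The sketch below transcribes only the outcome columns of the table; the tuple layout and the function name `shows_expected_pattern` are illustrative choices, and the contaminated T10 control is recorded as `None` rather than as a positive or negative result.

```python
# Outcomes transcribed from Table 5:
# (experiment, target site, microhomology present,
#  test amplicon detected, test amplicon sequenced, control amplicon detected)
# None marks the contaminated T10 control.
TABLE_5 = [
    ("T1", "L70a", False, False, False, False),
    ("T2", "L70a", False, True, True, False),
    ("T3", "L70c", False, True, True, False),
    ("T4", "L70c", False, True, True, False),
    ("T5", "L70c", True, True, True, False),
    ("T6", "L70c", True, True, False, False),
    ("T7", "L70d", False, False, False, False),
    ("T8", "L70d", False, True, True, False),
    ("T9", "L70d", True, True, True, False),
    ("T10", "L70d", True, True, False, None),
]


def shows_expected_pattern(row):
    """Positive test amplicon together with a clean negative control."""
    _, _, _, test_pcr, _, control_pcr = row
    return test_pcr and control_pcr is False


expected_rows = [row for row in TABLE_5 if shows_expected_pattern(row)]
expected = [row[0] for row in expected_rows]
sequenced = [row[0] for row in TABLE_5 if row[4]]
with_microhomology = [row[0] for row in expected_rows if row[2]]
without_microhomology = [row[0] for row in expected_rows if not row[2]]

print(len(expected), expected)
print(len(sequenced), sequenced)
print(with_microhomology, without_microhomology)
```

Under this reading, seven experiments show the expected pattern and six have sequence-confirmed amplicons, and the seven include fragments both with and without microhomology.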

Example 6 Targeted Genome Modification with CRISPR/Cas9 Complex Genes from Streptococcus thermophilus

It may be desirable to accomplish CRISPR-mediated genome modification of some plants (e.g., crop plants) with CRISPR complex genes derived from Streptococcus thermophilus instead of S. pyogenes. The inventors have developed an expression cassette encoding a codon-optimized nucleotide sequence with two nuclear localization signals (NLS) (SEQ ID NO:136) of the Cas9 protein from S. thermophilus (SEQ ID NO:69). The StCas9 was designed to encode both an N-terminal and a C-terminal nuclear-localization signal (NLS) (SEQ ID NO:120) at amino acid positions 2-11 and 1133-1142 (SEQ ID NO:135). Additionally, the DNA expression cassette (SEQ ID NO:136) included an intron at nucleotide positions 507-695. A series of unique S. thermophilus single-guide RNAs (sgRNAs) has been designed. The S. thermophilus sgRNA was designed to link the native S. thermophilus crRNA and tracrRNA with a stem loop (5′-CCAAAAGG-3′; SEQ ID NO:105), and to contain the spacer sequence complementary to the protospacer of the corn genomic target sites selected from Zm_L70e (SEQ ID NO:72), Zm_L70f (SEQ ID NO:73), Zm_L70g (SEQ ID NO:74), or Zm_L70h (SEQ ID NO:75). The seven nucleotides at the 3′ end of each of these genomic target sites represent the S. thermophilus-specific protospacer adjacent motif (PAM, 5′-NNAGAAW-3′; SEQ ID NO:106). FIG. 6 shows the predicted secondary structure of this S. thermophilus sgRNA (SEQ ID NO:70) with a copy of the spacer sequence (SEQ ID NO:71) complementary to the protospacer sequence of the corn Zm_L70h genomic target site (SEQ ID NO:75) and stem-loop linker (5′-CCAAAAGG-3′; SEQ ID NO:105). Table 6 lists the corresponding SEQ ID NOs for the DNA and RNA sequences encoding S. thermophilus sgRNAs containing spacer sequences complementary to the protospacer sequence of the corn genomic target sites Zm_L70e, Zm_L70f, Zm_L70g, and Zm_L70h.
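Whether a candidate target site ends in the S. thermophilus PAM 5′-NNAGAAW-3′ can be checked by expanding the IUPAC ambiguity codes (N = any base, W = A or T) into a regular expression. The following is a minimal sketch; the example sequences are hypothetical and are not the Zm_L70 target sites, and the function names are illustrative only.

```python
import re

# IUPAC nucleotide codes needed for the PAM: N = any base, W = A or T.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "W": "[AT]", "N": "[ACGT]"}

ST_PAM = "NNAGAAW"  # S. thermophilus PAM as given in the text


def pam_pattern(pam):
    """Compile an IUPAC-coded motif into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in pam))


def ends_with_pam(target_site, pam=ST_PAM):
    """True if the last len(pam) nucleotides of the site match the motif."""
    tail = target_site.upper()[-len(pam):]
    return pam_pattern(pam).fullmatch(tail) is not None


# Hypothetical sequences for illustration only.
matching_site = "ACGTACGTACGTACGTACGT" + "TCAGAAT"
non_matching_site = "ACGTACGTACGTACGTACGT" + "TCAGGAT"

print(ends_with_pam(matching_site))      # True
print(ends_with_pam(non_matching_site))  # False
```

The same helper accepts other motifs built from these codes; for example, `ends_with_pam(site, pam="NGG")` would test for a PAM of the S. pyogenes type.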

TABLE 6 DNA and RNA sequences of Streptococcus thermophilus sgRNA containing spacer sequences complementary to the protospacer sequence of the corn genomic target sites Zm_L70e, Zm_L70f, Zm_L70g, and Zm_L70h.

SEQ ID NO. (DNA)   SEQ ID NO. (RNA)   Genomic target site
107                111                Zm_L70e
108                112                Zm_L70f
109                113                Zm_L70g
110                114                Zm_L70h

The assay for S. thermophilus Cas9 mediated genome modification was essentially as described in Example 5. Specifically, 320,000 corn protoplasts were transfected with 0.8 pmol S. thermophilus Cas9 (SEQ ID NO:136) expression construct, and 1.6 pmol of one of the sgRNA expression constructs driven by the 397 bp version of the U6 promoter from corn chromosome 8 (SEQ ID NO:7) containing the spacer sequence corresponding to the genomic target sites: sgRNA construct for site L70e (SEQ ID NO:107), sgRNA construct for site L70f (SEQ ID NO:108), and sgRNA construct for site L70g (SEQ ID NO:109), and 50 pmol of a pre-annealed blunt-end double-strand DNA fragment encoded by SEQ ID NO:115 and SEQ ID NO:116. To test for transformation efficiency, 2.5 μg of a construct encoding green fluorescent protein (GFP) was included. At the time of harvesting, an aliquot of the transfected protoplasts was collected to calculate transfection frequency on the PE Operetta® Imaging System (PerkinElmer, Waltham, Mass.), which calculates the ratio of GFP positive cells per total cells. Omission of the StCas9 expression cassette during the corn protoplast transformation served as the negative control. Protoplasts were harvested 48 hours post transfection and analyzed for insertion of the blunt-end double-strand DNA fragment into the L70e, L70f, or L70g genomic target site by quantitative, high-throughput PCR analysis using a BioRad QX200™ Droplet Digital™ PCR (ddPCR™) system (BioRad, Hercules, Calif.) and TaqMan® probes. To determine the percent targeted integration rate, one set of TaqMan primers and probes was used with the ddPCR system to detect the template copy number of a junction of the inserted blunt-end double-strand DNA fragment at the chromosomal target site.
The junction specific primers and probe for corn chromosomal sites L70e, L70f, L70g, and L70h are indicated in Table 7. To normalize the amount of DNA in the transfected protoplast aliquot, the ddPCR system was used with a second set of TaqMan primers and a probe (primers encoded by SEQ ID NO:132 and SEQ ID NO:134; probe encoded by SEQ ID NO:133) to determine the template copy number of a site unique in the corn genome and outside of the target site. The calculation for the percent targeted integration rate was the target site specific template copy number divided by the corn genome specific template copy number divided by the transformation frequency as determined by GFP-positive vs. total cell counts using the PE Operetta® Imaging System (PerkinElmer, Waltham, Mass.). The data points presented in the graph were determined by averaging four biological replicates. The results are presented in FIG. 15 and show that the percent integration rate for each of the sites L70e, L70f, and L70g was higher than the corresponding control.
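The normalization described above can be written out as a small calculation. This is a sketch only: the replicate values are invented for illustration and are not data from these experiments, the multiplication by 100 to express the ratio as a percentage is an assumption, and the function name `percent_targeted_integration` is hypothetical.

```python
from statistics import mean


def percent_targeted_integration(junction_copies, reference_copies,
                                 transfection_frequency):
    """Junction copies / reference copies / transfection frequency, as a percent.

    transfection_frequency is the fraction of GFP-positive cells (0 to 1].
    """
    if reference_copies <= 0:
        raise ValueError("reference copy number must be positive")
    if not 0 < transfection_frequency <= 1:
        raise ValueError("transfection frequency must be in (0, 1]")
    ratio = junction_copies / reference_copies
    return 100.0 * ratio / transfection_frequency


# Hypothetical ddPCR copy numbers and GFP-based transfection frequencies
# for four biological replicates (illustrative values only).
replicates = [
    (12, 1000, 0.60),
    (9, 900, 0.50),
    (15, 1200, 0.62),
    (10, 1000, 0.55),
]

rates = [percent_targeted_integration(*rep) for rep in replicates]
print([round(rate, 2) for rate in rates])
print(round(mean(rates), 2))
```

Dividing by the transfection frequency expresses the integration rate relative to the cells that actually received DNA, so that samples with different transfection efficiencies can be compared.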

PCR amplicons corresponding to targeted junctions from the protoplast experiments were sequenced to confirm the integration of the blunt-end double-strand DNA fragments into the selected target sites. FIG. 15B shows an alignment of the expected integration of the blunt-end, double-strand DNA fragment at the L70f target site (SEQ ID NO:144) and one example of target site integration (SEQ ID NO:145) with deletion of some of the sequence of the DNA fragment. Although these sequencing results show indels, the results confirm that the DNA fragment was integrated at the L70f target site.

TABLE 7 SEQ ID NOs for primers and probes for PCR amplification of junction at corn chromosomal target sites with inserted DNA fragment.

Site   SEQ ID NO: of Genomic specific primer   SEQ ID NO: of Probe   SEQ ID NO: of Inserted DNA specific primer
L70e   139                                     138                   137
L70f   140                                     138                   137
L70g   141                                     138                   137
L70h   142                                     138                   137

Example 7 Targeting Multiple Unique Genomic Sites by sgRNA Multiplexing

A key advantage of the CRISPR system, as compared to other genome engineering platforms, is that multiple sgRNAs directed to separate and unique genomic target sites can be delivered as individual components to effect targeting. Alternatively, multiple sgRNAs directed to separate and unique genomic target sites can be multiplexed (i.e., stacked) in a single expression vector to effect targeting. One application that can require multiple targeted endonucleolytic cleavages is marker-gene removal from a transgenic event (FIG. 7A). The CRISPR system can be used to remove the selectable marker from the transgenic insert, leaving behind the gene of interest.

Another example of an application in which such a CRISPR/Cas system can be useful is when there is a requirement for multiple targeted endonucleolytic cleavages, such as when the identification of causal genes behind a quantitative trait is hampered by lack of meiotic recombinations in the QTL regions that would separate the gene candidates from each other. This can be circumvented by transformation with several CRISPR constructs targeting the genes of interest simultaneously. These constructs would either knock out the gene candidates by frame shift mutations or remove them by deletion. Such transformations can also lead to random combinations of intact and mutant loci that would allow for identification of causal genes (FIG. 7B).

Example 8 Integration Rates as a Function of Blunt-End DNA Fragment Concentration and Time

The corn protoplast system essentially as described in Example 5 was used to determine the optimal concentration of blunt-end double-strand DNA fragment to be included in the assay mixture to achieve the highest percentage targeted integration rate. For these assays the expression construct encoding the S. pyogenes Cas9 was modified to include an intron from position 469-657 in the coding region (SEQ ID NO:119). Additionally, the protein sequence (SEQ ID NO:118) contained two NLS sequences (SEQ ID NO:120), one at the amino-terminal end (amino acids 2 to 11 of SEQ ID NO:118) and one at the carboxy-terminal end (amino acids 1379 to 1388 of SEQ ID NO:118).

For the assay, 320,000 corn protoplasts were transfected with 0.8 pmol S. pyogenes Cas9 (SEQ ID NO:119) expression construct, 1.6 pmol of sgRNA expression construct driven by the 397 bp version of the U6 promoter from corn chromosome 8 (SEQ ID NO:7) containing the spacer sequence corresponding to the genomic target site Zm7 (SEQ ID NO:23), and a pre-annealed blunt-end double-strand DNA fragment (SEQ ID NO:115 and SEQ ID NO:116) at 1, 5, 10, 25, 50, and 100 pmol. To measure transformation efficiency, 2.5 μg of a construct encoding green fluorescent protein (GFP) was included and the number of GFP positive protoplasts per 320,000 corn protoplasts was determined. Omission of the Cas9 expression cassette during the corn protoplast transformation served as the negative control. Protoplasts were harvested at 24 hours and 48 hours post transfection and analyzed for insertion of the blunt-end double-strand DNA fragment into the Zm7 genomic target site by quantitative, high-throughput PCR analysis using a BioRad QX200™ Droplet Digital™ PCR (ddPCR™) system (BioRad, Hercules, Calif.) and TaqMan® probes. To determine the percent targeted integration rate, one set of TaqMan primers (represented by SEQ ID NO:137 and SEQ ID NO:143) and a probe (represented by SEQ ID NO:138) was used with the ddPCR system to detect the template copy number of a junction of the inserted blunt-end double-strand DNA fragment at the chromosomal Zm7 target site. To normalize the amount of DNA in the transfected protoplast aliquot, the ddPCR system was used with a second set of TaqMan primers and a probe (primers encoded by SEQ ID NO:132 and SEQ ID NO:134; probe encoded by SEQ ID NO:133) to determine the template copy number of a site unique in the corn genome and outside of the target site. The calculation for the percent targeted integration rate was the target site specific template copy number divided by the corn genome specific template copy number divided by the transformation frequency as determined by GFP-positive vs. total cell counts using the PE Operetta Imaging System (PerkinElmer, Waltham, Mass.). The data points presented in the graph were determined by averaging four biological replicates. The results are presented in FIG. 8 and show that the peak for percentage targeted integration rate was obtained with 50 pmol of the blunt-end, double-strand DNA fragment and incubation for 48 hours.

Example 9 Integration Rates as a Function of Cas9 Endonuclease Concentration

The corn protoplast system essentially as described in Example 8 was used to establish the optimal concentration of expression constructs encoding S. pyogenes Cas9 included in the protoplast transfection mixture to achieve the highest percentage targeted integration rate with the blunt-end double-strand DNA fragments. For these assays the expression construct encoding the modified S. pyogenes Cas9 was as described in Example 8. For the assay, 320,000 corn protoplasts were transfected with 0.1 pmol, 0.4 pmol, 0.8 pmol, or 1.6 pmol of the S. pyogenes Cas9 (SEQ ID NO:119) expression construct, and 1.6 pmol of sgRNA expression construct driven by the 397 bp version of the U6 promoter from corn chromosome 8 (SEQ ID NO:7) containing the spacer sequence corresponding to the genomic target site Zm7 (SEQ ID NO:23), 50 pmol of pre-annealed blunt-end double-strand DNA fragment (SEQ ID NO:115 and SEQ ID NO:116), and a construct encoding GFP. The corn protoplasts were harvested 48 hours post-transfection and the percentage targeted integration was assessed as described in Example 8 using the ddPCR system and TaqMan probes. The results of the Cas9 expression construct titration are presented in FIG. 9 and show a linear increase in percentage targeted integration rate over the full range of Cas9 expression construct amounts tested (0.1 to 1.6 pmol).

Example 10 Sequence Confirmation of Insertion of Blunt-End Double-Strand DNA Fragments

PCR amplicons corresponding to targeted junctions from the protoplast experiments detailed in Example 5 and Example 8 were sequenced to confirm the integration of the blunt-end double-strand DNA fragments into the selected target site, Zm7 or L70c.

For the corn chromosome site Zm7 targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment formed by annealed oligonucleotides encoded by SEQ ID NO:115 and SEQ ID NO:116 (see Example 8), PCR amplicons were agarose-gel purified and sequenced. The expected sequence is presented as SEQ ID NO:123, as shown in FIG. 10A. The results from the sequencing show at least one event with a base-pair perfect insertion of the blunt-end double-strand DNA fragment into the target site (SEQ ID NO:124). The results also show events with short deletions in either the chromosome or the DNA insert side of the junction, as indicated with SEQ ID NO:125 (see FIG. 10A).

For the corn chromosome site L70c targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment without micro-homology sequences formed by annealed oligonucleotides encoded by SEQ ID NO:45 and SEQ ID NO:46 (see Example 5), PCR amplicons were agarose-gel purified and sequenced. The expected sequence is presented as SEQ ID NO:126, as shown in FIG. 10B. The results from the sequencing show at least one event that was detected with a base-pair perfect insertion of the blunt-end double-strand DNA fragment into the target site (SEQ ID NO:127). The results also show an example of events with short deletions in either the chromosome or the DNA insert side of the junction, as indicated with SEQ ID NO:128 (see FIG. 10B).

For the corn chromosome site L70c targeted by CRISPR/Cas9 constructs and with blunt-end double-strand DNA fragment with 3 bp micro-homology sequences at each end of the DNA fragment formed by annealed oligonucleotides encoded by SEQ ID NO:121 and SEQ ID NO:122 (see Example 5), PCR amplicons were agarose-gel purified and sequenced. The expected sequence is presented as SEQ ID NO:129, as shown in FIG. 10C. The results from the sequencing show at least one event that was detected with a base-pair perfect insertion at the junction of the blunt-end double-strand DNA fragment into the target site (SEQ ID NO:130). The results also show an example of events with short deletions in either the chromosome or the DNA insert side of the junction (SEQ ID NO:131) and/or in the DNA insert itself (SEQ ID NO:130 and SEQ ID NO:131), as indicated (see FIG. 10C).

These results indicate that blunt-end double-strand DNA fragments are incorporated into a double-strand break (DSB) at a target site created by a CRISPR/Cas9 system. The DNA fragments are incorporated by non-homologous end joining (NHEJ), an error-prone DNA repair mechanism that heals most somatic double-strand breaks in nature. Consistent with the endogenous NHEJ repair mechanism, the results show that blunt-end double-strand DNA fragments were incorporated with short deletions at the DSB created with CRISPR/Cas9 components, as illustrated by comparing SEQ ID NO:123 and SEQ ID NO:125 (FIG. 10A), and by comparing SEQ ID NO:126 and SEQ ID NO:128 (FIG. 10B), and by comparing SEQ ID NO:129 and SEQ ID NO:131 (FIG. 10C) (with this last pair there was also a 2 bp deletion internal to the inserted DNA fragment). Blunt-end double-strand DNA fragments were incorporated in a base-pair perfect manner at the DSB created with CRISPR/Cas9 components, as illustrated by comparing SEQ ID NO:123 and SEQ ID NO:124 (FIG. 10A), and by comparing SEQ ID NO:126 and SEQ ID NO:127 (FIG. 10B), and by comparing SEQ ID NO:129 and SEQ ID NO:130 (FIG. 10C) (though in this last pair there was a 2 bp deletion internal to the inserted DNA fragment).

Example 11 Integration Rates as a Function of TALEN Endonuclease Concentration

The corn protoplast system essentially as described in Example 8 was used to establish the optimal concentration of expression constructs encoding a pair of TALEN endonucleases needed in the transfection mixture to achieve the highest percentage targeted integration rate of blunt-end double-strand DNA fragments.

For these assays a pair of expression constructs with TALEN encodingcassettes was tested. The targeting site in the corn chromosome for theTALEN pair was L70.4. For the TALEN assay 0, 0.01, 0.02, 0.05, 0.1, 0.2and 0.4 pmol of each of the constructs containing the TALEN encodingcassettes was used in the corn protoplast transformation. Also includedwas 50 pmol of pre-annealed blunt-end double-strand DNA fragment (SEQ IDNO:115 and SEQ ID NO:116) and 2.5 ug of the GFP encoding construct. Thecorn protoplasts were harvested 48 hours post-transfection and thepercentage targeted integration was assessed by high-throughput PCRanalysis essentially as described in previous examples. The results ofthe analysis of the TALEN expression construct titration are presentedin FIG. 11 showing that the percentage targeted integration rateplateaus at about 0.1 pmol of each of the TALEN expression constructsincluded in the transfection reaction.

Example 12 Targeted Integration by Homologous Recombination—CRISPR/Cas9

Genome modification by targeted integration of a desired introduced DNA sequence will occur at sites of double-strand breaks (DSB) in a chromosome. The integration of the DNA sequence is mediated by mechanisms of non-homologous end-joining (NHEJ) or homologous recombination using DNA repair mechanisms of the host cell. DSBs at specific sites in the host cell genome can be achieved using an endonuclease such as an engineered meganuclease, an engineered TALEN or a CRISPR/Cas9 system.

A schematic representation of a high through-put (HTP) testing method of NHEJ and HR-mediated targeted integration is presented in FIG. 12. Targeted integration of a DNA fragment by non-homologous end-joining (NHEJ) is presented in FIG. 12A and targeted integration of a DNA fragment by homologous recombination (HR) is presented in FIG. 12B. For HR, a recombinant DNA construct containing a cassette with the DNA fragment flanked with left- and right-homology arms (Left-HA and Right-HA, respectively) is introduced into the host cell. Following either NHEJ or HR targeted integration, HTP PCR analysis is performed with primers (indicated by the short pair of arrows in FIGS. 12A and 12B) designed to detect a targeted event, where one primer is internal to the inserted DNA fragment and a second primer is located in the flanking chromosomal region.

The corn protoplast system as described in the above examples was used to determine homologous recombination (HR) mediated targeted integration rates. The target site Zm7 was targeted by a CRISPR/Cas9 nuclease and the sgRNA for targeting the corn Zm7 site, as described in Example 8. In addition to the constructs encoding the CRISPR/Cas9 and sgRNA cassettes, a construct containing a cassette for homologous recombination was included at either 4 ug or 6 ug. As described above, a construct encoding GFP was also transfected and the percentage of GFP positive cells was used in the calculation of the targeted integration rate. The controls did not contain the construct encoding the SpCas9 endonuclease.

The recombinant DNA constructs containing cassettes for homologous recombination were designed to have the 90 bp sequence corresponding to the 90 bp blunt-end, double-strand DNA fragment used for NHEJ assays (encoded by sequences SEQ ID NO:115 and SEQ ID NO:116) flanked by left and right homology arms (HA). The left-HA is designed based on the sequence flanking the 5′-side of the site for the double-strand break (DSB) for targeted integration. The right-HA is designed based on the sequence flanking the 3′-side of the site for the double-strand break (DSB) for targeted integration. For the Zm7 site the left-HA was 240 bp in length, and two separate right-HA sequences were included, one of 230 bp and one of 1003 bp in length (see FIGS. 13A and 13B, respectively).

Protoplasts were transfected and harvested 48 hours later and analyzed for integration by high through-put PCR with one primer designed for the region of the DNA fragment sequence (encoded by the sequences SEQ ID NO:115 and SEQ ID NO:116) and one primer in the chromosomal region flanking the left homology arm. The size of the expected PCR amplicon with successful HR using the Zm7 targeting constructs (FIGS. 13A and 13B) was 411 bp. In conventional quantitative PCR (qPCR), amplicons longer than about 160 bp cannot be quantitatively measured, and thus are not recommended. The current experiment clearly demonstrated that significantly longer PCR amplicons can also be used in the ddPCR system, which opens up a host of new opportunities in quantitative biology.

The HR-mediated recombination rates for the corn chromosomal site Zm7 are presented in Table 8 and FIG. 15. When the left-HA and the right-HA were 240 bp and 230 bp, respectively (indicated by SS in Table 8), and the construct with the homology arm cassette was at a concentration of 4 ug or 6 ug, there was not a statistically significant difference in the percentage integration rate between the test sample and the control. When the left-HA was 240 bp and the right-HA was 1003 bp (indicated by SL in Table 8), and the construct with the homology arm cassette was at a concentration of 4 ug, there was not a statistically significant difference in the percentage integration rate between the test sample and the control. In contrast, when the left-HA was 240 bp and the right-HA was 1003 bp (indicated by SL in Table 8), and the construct with the homology arm cassette was at a concentration of 6 ug, there was a statistically significant (p<0.05) difference in the percentage integration rate between the test sample and the control. This result shows that targeted integration can be achieved by the mechanism of HR at sites of DSB which are targeted by a CRISPR/Cas9 system in a corn genome.

TABLE 8 HR-mediated integration rates in corn protoplasts with DSB mediated by a CRISPR/Cas9 system at the chromosomal site Zm7.

                      Mean                   Std Dev
Sample                Test       Control     Test       Control
Zm7 + SS + 4 ug       0.88346    0.15936     0.83999    0.17658
Zm7 + SS + 6 ug       1.20057    0.15936     0.92889    0.17658
Zm7 + SL + 4 ug       1.297183   0.98692     0.791837   0.86133
Zm7 + SL + 6 ug**     2.32094    0.98692     1.35951    0.86133

**Test was statistically higher (p < 0.05) than the corresponding control based on a Student's t-test.
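The comparison behind the significance marks in the table can be sketched from summary statistics alone. The following is a minimal illustration of a pooled-variance two-sample Student's t statistic; it is not the analysis code used for these experiments. The number of replicates per group is not stated in this section, so `n_test` and `n_ctrl` are hypothetical arguments that the reader must supply, and the function name is chosen here for illustration only.

```python
import math


def pooled_t_statistic(mean_test, sd_test, n_test, mean_ctrl, sd_ctrl, n_ctrl):
    """Return (t, degrees_of_freedom) for a two-sample Student's t-test
    computed from means and standard deviations with a pooled variance.

    n_test and n_ctrl are replicate counts; they are NOT given in the
    source text and must be supplied by the caller (assumption).
    """
    if n_test < 2 or n_ctrl < 2:
        raise ValueError("each group needs at least two replicates")
    df = n_test + n_ctrl - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n_test - 1) * sd_test ** 2 + (n_ctrl - 1) * sd_ctrl ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_test + 1.0 / n_ctrl))
    return (mean_test - mean_ctrl) / std_err, df
```

The returned t value would then be compared against the critical value of the t distribution for the returned degrees of freedom at the chosen significance level (here, p < 0.05).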

Example 13 Targeted Integration by Homologous Recombination—TALEN

The corn protoplast system as described in the above examples was used to determine homologous recombination (HR) mediated targeted integration rates. The target site L70.4 was targeted by a pair of recombinant DNA constructs encoding a TALEN pair directed to target the corn L70.4 site, as described in Example 11. In addition to the constructs encoding the TALEN cassettes, a construct containing a cassette for homologous recombination was included at either 4 ug or 6 ug. As described above, a construct encoding GFP was also transfected and the percentage of GFP positive cells was used in the calculation of the targeted integration rate. The controls did not contain the constructs encoding the TALENs.

The recombinant DNA constructs containing cassettes for homologous recombination were designed to have the 90 bp sequence corresponding to the 90 bp blunt-end, double-strand DNA fragment used for NHEJ assays (encoded by sequences SEQ ID NO:115 and SEQ ID NO:116) flanked by left and right homology arms (HA). The left-HA is designed based on the sequence flanking the 5′-side of the site for the double-strand break (DSB) for targeted integration. The right-HA is designed based on the sequence flanking the 3′-side of the site for the double-strand break (DSB) for targeted integration. For the L70.4 site the right-HA was 230 bp in length, and two separate left-HA sequences were included, one of 230 bp and one of 1027 bp in length (see FIGS. 14A and 14B, respectively).

Protoplasts were transfected and harvested 48 hours later and analyzed for integration by quantitative, high through-put PCR using the ddPCR system and TaqMan probes with one primer designed for the region of the DNA fragment sequence (encoded by the sequences SEQ ID NO:115 and SEQ ID NO:116) and one primer in the chromosomal region flanking the left homology arm. The size of the expected PCR amplicon with successful HR using the L70.4 targeting construct of FIG. 14A was 383 bp. The size of the expected PCR amplicon with successful HR using the L70.4 targeting construct of FIG. 14B was 1208 bp.

The HR-mediated recombination rate for the corn chromosomal site L70.4 with two separate template DNA constructs is presented in Table 9. When the left-HA and the right-HA were both 230 bp (indicated by SS in Table 9), and the construct with the homology arm cassette was at a concentration of 4 ug, there was a statistically significant (p<0.05) difference in the percentage integration rate between the test sample and the control. When the left-HA and the right-HA were both 230 bp (indicated by SS in Table 9), and the construct with the homology arm cassette was at a concentration of 6 ug, there was not a statistically significant difference in the percentage integration rate between the test sample and the control. When the left-HA was 1027 bp and the right-HA was 230 bp (indicated by LS in Table 9), and the construct with the homology arm cassette was at a concentration of 4 ug or 6 ug, there was not a statistically significant difference in the percentage integration rate between the test sample and the control. This result shows that targeted integration can be achieved by the mechanism of HR at sites of DSB which are targeted by TALENs directed to a specific site in a corn genome.

TABLE 9 HR-mediated integration rates in corn protoplasts with DSB mediated by TALENs at the chromosomal site L70.4.

                        Mean                   Std Dev
Sample                  Test       Control     Test       Control
L70.4 + SS + 4 ug**     1.54833    0.12181     1.48997    0.14504
L70.4 + SS + 6 ug       0.28395    0.12181     0.20174    0.14504
L70.4 + LS + 4 ug       0.163347   0.38048     0.282926   0.67502
L70.4 + LS + 6 ug       0.51467    0.38048     0.23052    0.67502

**Test was statistically higher (p < 0.05) than the corresponding control based on a Student's t-test.

Example 14 Targeting in Corn Genome with Chimeric U6 Promoters

Chimeric U6 promoters were determined to be effective at driving expression of sgRNA constructs and resulting in targeted integration of double-strand, blunt-end DNA fragments at preselected sites in corn chromosomes. These experiments were conducted using the quantitative chromosome cutting assay in corn protoplasts as described in Example 5 and Example 6. The U6 promoters incorporated into the sgRNA constructs were: a) the 397 bp corn chromosome 8 U6 promoter encoded by SEQ ID NO:7, b) the 397 bp ch1:ch8 chimeric U6 promoter encoded by SEQ ID NO:18, c) the 397 bp ch8:ch1 chimeric U6 promoter encoded by SEQ ID NO:19, and d) the 397 bp ch8:ch2:ch1:ch8 chimeric U6 promoter encoded by SEQ ID NO:20. The corn chromosomal target sites were L70a, L70c, and L70d, as described in Example 5. The CRISPR/Cas9 system employed an expression cassette with the S. pyogenes Cas9 modified to contain two NLS sequences and an intron and encoded by SEQ ID NO:119. The double-strand, blunt-end DNA fragment was encoded by SEQ ID NO:115 and SEQ ID NO:116.

In one assay, 48 hours post transfection of the corn protoplasts with the CRISPR/Cas9 system components, the quantitative assay was done with TaqMan probes. The results (see FIG. 16A) indicate that, at target site L70a, the sgRNA construct containing the ch8 U6 promoter and the sgRNA construct containing the chimeric ch1:ch8 U6 promoter gave about equivalent targeted integration rates. At target site L70c, the sgRNA construct containing the chimeric ch8:ch1 U6 promoter gave about double the targeted integration rate of the sgRNA construct containing the ch8 U6 promoter. At target site L70d, the sgRNA construct containing the ch8 U6 promoter gave a higher targeted integration rate than the sgRNA construct containing the chimeric ch8:ch2:ch1:ch8 U6 promoter.

In another assay, 48 hours post transfection of the corn protoplasts with the CRISPR/Cas9 system components, the quantitative assay was done with EvaGreen® (BioRad, Hercules, Calif.) intercalating dye. The results (see FIG. 16B) indicate that the targeted integration rate with the sgRNA construct containing the ch8 U6 promoter was nearly the same as the targeted integration rate at target site L70a with the sgRNA construct containing the chimeric ch1:ch8 U6 promoter, at target site L70c with the sgRNA construct containing the chimeric ch8:ch1 U6 promoter, and at target site L70d with the sgRNA construct containing the chimeric ch8:ch2:ch1:ch8 U6 promoter. These data indicate that the targeted integration rate detected by the EvaGreen intercalating dye was about ten-fold higher compared to the targeted integration rates detected using MGB TaqMan probes. This discrepancy is mostly due to differences in the chemistries of the assays. The TaqMan assay uses just two primers and an internal probe, of which one of the primers and the probe are located on the inserted DNA fragment sequence. Unfortunately, the double-strand, blunt-end DNA fragment used in the transfection often undergoes degradation by endogenous exonucleases in the protoplasts, and this results in DNA fragment integrations that are truncated at the site where the TaqMan probe binds. These truncated integration events are not detectable by the TaqMan assay. On the other hand, the binding site for the TaqMan primer located within the inserted DNA fragment sequence is located more internally in the inserted DNA fragment and remains intact even in most truncated inserted DNA fragments. Since the assay with the intercalating EvaGreen dye does not require the internal probe, and only the TaqMan primers, this assay is not affected by oligo degradation and thus can detect many more integrations than the TaqMan assay. Otherwise, the two methods of measuring the percent targeted integration showed similar patterns at the three chromosomal target sites and the three different chimeric U6 promoters driving sgRNA expression.
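The reasoning above about why the dye-based assay scores more events than the probe-based assay can be sketched as a small model. This is only an illustration under stated assumptions: the coordinates of the probe and primer binding sites within the inserted fragment are not given in this section, so `probe_start` and `primer_start` are hypothetical values, and truncation is modeled as loss of bases from the fragment end nearest the probe site.

```python
def detected_by(truncation_bp, probe_start=5, primer_start=40):
    """Report which assay would score an integration event.

    truncation_bp: bases lost from the fragment end nearest the probe site.
    probe_start / primer_start: hypothetical 0-based first base of the probe
    and primer binding sites within the inserted fragment (assumptions).
    """
    if truncation_bp < 0:
        raise ValueError("truncation_bp must be non-negative")
    # A binding site survives only if the truncation stops before it begins.
    probe_intact = truncation_bp <= probe_start
    primer_intact = truncation_bp <= primer_start
    return {
        # The probe-based assay needs both the primer and the probe sites.
        "taqman": probe_intact and primer_intact,
        # The intercalating-dye assay needs only the primer site.
        "evagreen": primer_intact,
    }
```

With these illustrative coordinates, an insert that has lost 10 bases would be scored by the dye-based assay but missed by the probe-based assay, which is the pattern the paragraph describes.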

These results show that the targeted integration rate at corn chromosomal site L70c when the sgRNA construct contains the ch8:ch1 chimeric promoter was slightly to significantly higher than the targeted integration rate when the sgRNA construct contains the ch8 U6 promoter (FIGS. 16A and 16B). These results also show that the targeted integration rate at corn chromosomal site L70a when the sgRNA construct contains the ch1:ch8 chimeric promoter is about equivalent to the targeted integration rate when the sgRNA construct contains the ch8 U6 promoter (FIGS. 16A and 16B). Finally, these results show that the targeted integration rate at corn chromosomal site L70d when the sgRNA construct contains the ch8:ch2:ch1:ch8 chimeric promoter was lower than the targeted integration rate when the sgRNA construct contains the ch8 U6 promoter (FIGS. 16A and 16B). In conclusion, at least two of the three chimeric promoters were as good as, or better than, the best non-chimeric promoter in corn. These will have utility in multiplex targeting experiments, where the diversity of expression elements is indispensable.

Example 15 Targeted Mutation in Tomato Invertase Inhibitor

The CRISPR/Cas9 system was used to knock out the apoplastic invertase inhibitor gene of tomato (INVINH1) by introducing targeted frameshift point mutations following imperfect repair of the targeted double-strand breaks by NHEJ. In an earlier study, knock-down of this gene by RNAi showed elevated fruit sugar content and increased seed weight (Jin et al. Plant Cell 21:2072-2089, 2009). Reducing or eliminating the invertase inhibitor activity by either targeted mutagenesis or RNA interference is useful to improve yield and/or quality traits in other crop species too (Braun et al. J Exp Bot 65: 1713-1735, 2014).

For these experiments tomato protoplasts were transfected with an expression construct containing a cassette encoding the SpCas9 with one NLS at the C-terminus (SEQ ID NO:28), and one expression construct encoding an sgRNA cassette where expression was driven by one of 4 separate tomato U6 promoters: promoter 1 encoded by SEQ ID NO:146 (which is a fragment of SEQ ID NO:10), promoter 2 encoded by SEQ ID NO:147 (which is a fragment of SEQ ID NO:11), promoter 3 encoded by SEQ ID NO:148 (which is a fragment of SEQ ID NO:9), or promoter 4 encoded by SEQ ID NO:149. The sgRNAs were targeted to an invertase inhibitor site (site 1) without a SmlI site or to a site (labeled site 2) in the invertase inhibitor gene with a SmlI restriction endonuclease site. The site 2 sgRNA is encoded by SEQ ID NO:150. The CRISPR/Cas9 cleavage site within target site 2 contains a SmlI restriction endonuclease site. Upon a CRISPR/Cas9 induced double-strand break at target site 2, NHEJ repair will result in indels at this site, thus effectively removing the SmlI restriction endonuclease site. This mutation of the SmlI site was leveraged during the screening for targeted events by amplifying a 380 bp amplicon (SEQ ID NO:159) and subjecting the PCR amplicon to digestion with SmlI. If the SmlI site was not mutated, then the amplicon would be digested into two fragments of 181 bp and 199 bp. If the SmlI site was mutated, then the PCR amplicon would not be digested. This PCR scheme is illustrated in FIG. 17A.
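The digestion-based screen described above can be sketched as follows. This is a minimal illustration that uses only the sizes stated in the text (a 380 bp amplicon cut into 181 bp and 199 bp fragments); the function names and the band-size tolerance are hypothetical choices made here, not part of the described method.

```python
def expected_digest_fragments(amplicon_bp=380, cut_offset_bp=181, site_intact=True):
    """Return the fragment sizes expected after restriction digestion.

    An intact site yields two fragments; a mutated site leaves the
    amplicon undigested (small indels shift its length by a few bp).
    """
    if not 0 < cut_offset_bp < amplicon_bp:
        raise ValueError("cut position must lie inside the amplicon")
    if site_intact:
        return sorted([cut_offset_bp, amplicon_bp - cut_offset_bp])
    return [amplicon_bp]


def has_undigested_band(observed_bands_bp, amplicon_bp=380, tolerance_bp=10):
    """True if a band near full length persists after digestion, which is
    the signature of a mutated restriction site. tolerance_bp is an
    illustrative allowance for small indels and gel resolution."""
    return any(abs(band - amplicon_bp) <= tolerance_bp for band in observed_bands_bp)
```

In a mixed protoplast population, digested and undigested bands are expected together, so the presence of a full-length band after digestion, rather than the absence of the two smaller bands, is the informative observation.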

Tomato protoplasts were transfected with the CRISPR/Cas9 system targeting the tomato invertase inhibitor and harvested 48 hours later, and genomic DNA was extracted. The negative control for the CRISPR/Cas9 system was omission of the expression construct encoding the Cas9 endonuclease. A negative control for the target site was use of a sgRNA to target site 1; the SmlI site is not expected to be mutated with this sgRNA. PCR amplification was done with primers SEQ ID NO:157 and SEQ ID NO:158 and the resulting PCR amplicons were either undigested or digested with SmlI. The reactions were run on agarose gels and the results are shown in FIG. 17B. The negative controls of sgRNA to target site 1 and the omission of Cas9 endonuclease resulted only in PCR amplicons with the SmlI site intact. When the sgRNA was for target site 2, the SmlI site was mutated when the sgRNA cassette contained tomato U6 promoter 1, tomato U6 promoter 2, or tomato U6 promoter 3, as evidenced by the full-length PCR amplicons (see FIG. 17B, arrows showing amplicons without a SmlI site). The sgRNA construct targeting site 2 and with U6 promoter 4 apparently did not show targeting.

To confirm that the PCR amplicons without a SmlI site were indeed due to CRISPR/Cas9 induced NHEJ mutation, these apparent mutated amplicons were gel-purified, pooled, and sequenced. The multiple sequence alignment in FIG. 17C shows that these PCR amplicons without a SmlI site were from the target site 2 of the tomato invertase inhibitor and contained indels, consistent with CRISPR/Cas9 induced mutation. Specifically, in the multiple sequence alignment, SEQ ID NO:151 represents a region of the PCR amplicon (SEQ ID NO:159) without a mutation. SEQ ID NOs:152 and 153 illustrate indels where there was a 1 bp insertion at the cleavage site. SEQ ID NO:154 illustrates an indel with a 3 bp deletion at the cleavage site. SEQ ID NO:155 illustrates an indel with a 4 bp deletion at the cleavage site. SEQ ID NO:156 illustrates an indel with a 6 bp deletion at the cleavage site. In conclusion, these results indicate that the CRISPR/Cas9 system using tomato U6 promoter 1 (SEQ ID NO:146), tomato U6 promoter 2 (SEQ ID NO:147), or tomato U6 promoter 3 (SEQ ID NO:148) to drive sgRNA induces mutation at the tomato invertase inhibitor gene target site 2.
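The indel sizes listed above differ in their likely effect on the reading frame, which can be checked with simple modular arithmetic. The sketch below assumes the cleavage site lies within protein-coding sequence, which this section does not state explicitly; the function name is chosen here for illustration.

```python
def classify_indel(net_length_change_bp):
    """Classify an indel by its net length change (positive for an
    insertion, negative for a deletion). A change that is not a multiple
    of 3 shifts the reading frame, assuming the site is in coding sequence."""
    if net_length_change_bp == 0:
        return "no length change"
    kind = "insertion" if net_length_change_bp > 0 else "deletion"
    frame = "in-frame" if net_length_change_bp % 3 == 0 else "frameshift"
    return f"{abs(net_length_change_bp)} bp {kind}, {frame}"


# Net length changes reported for the sequenced amplicons in this example.
reported = {"SEQ ID NOs:152-153": +1, "SEQ ID NO:154": -3,
            "SEQ ID NO:155": -4, "SEQ ID NO:156": -6}
calls = {name: classify_indel(change) for name, change in reported.items()}
```

Under that assumption, the 1 bp insertions and the 4 bp deletion would shift the reading frame, while the 3 bp and 6 bp deletions would remove one or two codons without shifting it.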

Example 16 Promoters to Drive sgRNA Expression

To identify and select additional promoters which would be useful to drive expression of sgRNAs from expression cassettes introduced into dicots and monocots, RNA polymerase II (Pol II) and RNA polymerase III (Pol III) promoters (SEQ ID NOs:160-201 and SEQ ID NOs:247-283) were identified by comparing the sequences encoding U6, U3, U5, U2 and 7SL small nuclear RNA (snRNA) against soy and corn genomes using BLAST (see Table 10). From regions of this bioinformatic alignment, 200 or more nucleotides immediately upstream of the 5′ end of the coding region of the respective snRNA were used for testing as putative promoters for driving expression of sgRNA from expression cassettes introduced into plant cells.
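The step of taking the nucleotides immediately upstream of an snRNA coding region can be sketched as below. This is an illustration only, not the pipeline used here: it assumes the alignment hit is available as 0-based, half-open forward-strand coordinates plus a strand, and the function and parameter names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(chromosome_seq, hit_start, hit_end, strand, length=200):
    """Return up to `length` nucleotides immediately upstream of the 5' end
    of a coding region, read in the orientation of the gene.

    hit_start, hit_end: 0-based, half-open forward-strand coordinates of
    the snRNA coding region (assumed input format).
    """
    if strand == "+":
        begin = max(0, hit_start - length)
        return chromosome_seq[begin:hit_start]
    if strand == "-":
        # For a reverse-strand gene, upstream lies to the right of the hit
        # on the forward strand and must be reverse-complemented.
        end = min(len(chromosome_seq), hit_end + length)
        return reverse_complement(chromosome_seq[hit_end:end])
    raise ValueError("strand must be '+' or '-'")
```

The region is clipped at the chromosome ends, so a hit close to a sequence boundary returns fewer than `length` nucleotides.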

TABLE 10 SEQ ID NO of putative promoter sequence upstream of the snRNA genes and source (tomato or soy or corn).

Promoter                               Promoter + GUS +
SEQ ID NO:   snRNA Promoter   Source   Terminator SEQ ID NO   Terminator
148          Promoter 3       tomato   202                    poly(T)7
160          SoyU6a           soy      203                    poly(T)7
161          SoyU6c           soy      204                    poly(T)7
162          SoyU6d           soy      205                    poly(T)7
163          SoyU6e           soy      206                    poly(T)7
164          SoyU6f           soy      207                    poly(T)7
165          SoyU6g           soy      208                    poly(T)7
166          SoyU6i           soy      209                    poly(T)7
167          U3a              soy      210                    poly(T)7
168          U3b              soy      211                    poly(T)7
169          U3c              soy      212                    poly(T)7
170          U3d              soy      213                    poly(T)7
171          U3e              soy      214                    poly(T)7
172          7SL_CR13         soy      215                    poly(T)7
173          7SL_CR14         soy      216                    poly(T)7
174          7SL_CR10         soy      217                    poly(T)7
175          7SLCR01          corn     218                    poly(T)7
176          7SLCR07          corn     219                    poly(T)7
177          7SLCR09          corn     220                    poly(T)7
178          U3CR02           corn     221                    poly(T)7
179          U3CR10           corn     222                    poly(T)7
180          U3CR08           corn     223                    poly(T)7
181          U3CR08b          corn     224                    poly(T)7
182          U3CR05           corn     225                    poly(T)7
183          U2snRNA_P        corn     226                    SEQ ID NO 237
184          U2snRNA_I        corn     227                    SEQ ID NO 237
185          U2snRNA_B        corn     228                    SEQ ID NO 237
186          U2snRNA_G        corn     229                    SEQ ID NO 237
187          U2snRNA_A        corn     230                    SEQ ID NO 237
188          U5snRNA_A        corn     231                    SEQ ID NO 237
189          U5snRNA_C        corn     232                    SEQ ID NO 237
190          U5snRNA_D        corn     233                    SEQ ID NO 237
191          U5snRNA_E        corn     234                    SEQ ID NO 237
192          U2snRNA_C        corn     -                      -
193          U2snRNA_D        corn     -                      -
194          U2snRNA_E        corn     -                      -
195          U2snRNA_F        corn     -                      -
196          U2snRNA_H        corn     -                      -
197          U2snRNA_K        corn     -                      -
198          U2snRNA_L        corn     -                      -
199          U2snRNA_M        corn     -                      -
200          U6Chr08          corn     235                    poly(T)7
201          U6Chr01          corn     236                    poly(T)7
247          U2CR01a          soy      -                      -
248          U2CR01b          soy      -                      -
249          U2CR02           soy      -                      -
250          U2CR03           soy      -                      -
251          U2CR04           soy      -                      -
252          U2CR05a          soy      -                      -
253          U2CR05b          soy      -                      -
254          U2CR06a          soy      -                      -
255          U2CR06b          soy      -                      -
256          U2CR06v          soy      -                      -
257          U2CR07           soy      -                      -
258          U2CR08a          soy      -                      -
259          U2CR08b          soy      -                      -
260          U2CR08c          soy      -                      -
261          U2CR10a          soy      -                      -
262          U2CR10b          soy      -                      -
263          U2CR10c          soy      -                      -
264          U2CR13           soy      -                      -
265          U2CR14           soy      -                      -
266          U2CR15           soy      -                      -
267          U2CR17a          soy      -                      -
268          U2CR17b          soy      -                      -
269          U2CR17c          soy      -                      -
270          U2CR17d          soy      -                      -
271          U2CR17e          soy      -                      -
272          U2CR17f          soy      -                      -
273          U2CR19a          soy      -                      -
274          U2CR19b          soy      -                      -
275          U2CR20           soy      -                      -
276          U5CR07           soy      -                      -
277          U5CR10           soy      -                      -
278          U5CR10           soy      -                      -
279          U5CR15           soy      -                      -
280          U5CR19           soy      -                      -
281          U5CR20a          soy      -                      -
282          U5CR20b          soy      -                      -
283          SoyU6b           soy      -                      -

Example 17 Normalized RNA Transcript Level Assay

To assess the efficacy of the promoters listed in Table 10 to drive expression of sgRNAs, a series of constructs was generated, each containing a cassette encoding one of the putative promoters (SEQ ID NO:148 and SEQ ID NOs:160-201) operably linked to a 221 bp fragment of a beta-glucuronidase (GUS) open reading frame and either a poly(T)7 terminator for Pol III promoters (7SL, U6, and U3) or the sequence 5′-ACAATTCAAAACAAGTTTTAT-3′ (SEQ ID NO:237) for the Pol II U2 and U5 promoters (Table 10). The recombinant constructs (0.5 pmol) containing the promoter-GUS fragment fusions were transfected into soy cotyledon protoplasts (SEQ ID NOs:202-217) or corn leaf protoplasts (SEQ ID NOs:218-236) along with 300 ng of a plasmid serving as a transformation control encoding Renilla Luciferase (RLUC) expressed using the CaMV promoter. The transfected protoplasts were harvested 18 hours after transfection and the RNA levels were measured via TaqMan assays using a probe and primers complementary to the GUS fragment. Internal controls used to normalize the TaqMan assay included (1) an 18S primer pair/probe set to control for RNA concentration and (2) RLUC luminescence as a transformation control.

In soy cotyledon protoplasts, all promoters tested resulted in significantly higher normalized levels of GUS mRNA than the control (no GUS construct) (one-way ANOVA, Student's t-test, p value<0.05) (FIG. 18A). The lowest level of normalized GUS mRNA was with the construct (SEQ ID NO:210) containing the U3a promoter (SEQ ID NO:167). The highest level of normalized GUS mRNA was with the construct (SEQ ID NO:217) containing the 7SL_CR10 promoter (SEQ ID NO:174). The level of normalized GUS mRNA with all promoters tested with this assay ranged from 11-31 times higher than that of the no DNA negative control. No one class of promoters (U6, U3, or 7SL) performed better than the others, although the U3 promoters were generally in the lower range of expression observed in the experiment. U3 promoters have been successfully used by Liang et al. (J. Genetics and Genomics 41:63-68, 2014) to drive sgRNAs in corn. Thus, although these data indicate that expression from the U3 promoters may be lower than from U6 or 7SL, they are still viable candidates to drive sgRNA expression in soy. These data suggest that any of the U6, U3, or 7SL promoters identified here would be good candidates for making recombinant expression constructs to drive expression of sgRNA in plant cells.

In corn leaf protoplasts, all promoters tested resulted in significantly higher normalized levels of GUS mRNA compared to the control (one-way ANOVA, Student's t-test, p value<0.05) with values ranging from 26-fold to 141-fold higher expression than the negative control (FIG. 18B). The U6Chr08 promoter construct (SEQ ID NO:235) resulted in the highest normalized levels of GUS mRNA expression, and the U2snRNA_I promoter construct (SEQ ID NO:227) resulted in the lowest, with approximately a 5.5-fold difference in normalized levels of GUS mRNA expression between them. The U2snRNA_P promoter construct (SEQ ID NO:226) also stood out as having high normalized levels of GUS mRNA expression. All the remaining promoters were within the same relative range, having less than a 2-fold difference between them (FIG. 18B). These data suggest that any of the U6, U3, 7SL, U2, or U5 promoters identified here would be good candidates for making recombinant expression constructs to drive expression of sgRNA in plant cells.
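The fold-over-control values reported for this assay depend on two internal controls. The exact arithmetic used for normalization is not given in this section, so the sketch below rests on an assumption: the GUS signal is divided by the 18S signal (RNA input) and by the RLUC signal (transformation efficiency), and the result is expressed relative to the same quantity for the negative control. All names and the example numbers in the usage note are hypothetical.

```python
def normalized_signal(gus, rna_18s, rluc):
    """Normalize a GUS transcript signal to RNA input (18S) and to
    transformation efficiency (RLUC). The formula is an assumption."""
    if rna_18s <= 0 or rluc <= 0:
        raise ValueError("control signals must be positive")
    return gus / rna_18s / rluc


def fold_over_control(sample, control):
    """Ratio of the normalized sample signal to the normalized control.
    Each argument is a dict with keys 'gus', 'rna_18s' and 'rluc'."""
    return normalized_signal(**sample) / normalized_signal(**control)
```

For instance, a sample with signals of 20, 2 and 5 (GUS, 18S, RLUC) against a control with signals of 1, 1 and 1 would give a two-fold increase under this formula; these numbers are made up purely to show the calculation.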

Example 18 GUS Expression Assay for sgRNA Expression

To determine how differences in sgRNA expression levels impact Cas9 activity, an assay was used that relied on activating transcription from a minimal promoter upstream of the GUS open reading frame in a reporter construct transfected into corn leaf protoplasts. For this assay, a Cas9 nuclease from S. thermophilus was mutated at amino acid positions D9A and H599A of the native protein sequence, effectively creating a Cas9 without endonuclease cleavage activity (also referred to as a ‘dead Cas9’). Additionally, this dead Cas9 was modified to encode one NLS domain (SEQ ID NO:120) at amino acid positions 2-11 of SEQ ID NO:239 and an activation domain from a TALE protein from amino acid positions 1135-1471 of SEQ ID NO:239. The polynucleotide sequence of the dead Cas9, represented by SEQ ID NO:238, included an intron at positions 507-695. A reporter construct was constructed where the uidA (GUS) reporter gene was driven by a minimal CaMV promoter with three adjacent sgRNA binding sites (SEQ ID NO:240) at nucleotide positions 80-98, 117-135, and 154-172 of the sequence SEQ ID NO:246. Also constructed was a set of sgRNA (based on the sgRNA of Cong et al. 2013 Science 339:819) expression constructs, each containing one promoter from each class of snRNA genes, namely U6, 7SL, U2, U5, and U3 (Table 11), which would target the dead Cas9-TALE-AD to one or more of the sgRNA binding sites of the GUS reporter construct. The U6 and 7SL promoters normally initiate transcription on a G, and the U2, U5 and U3 promoters normally initiate transcription on an A. To ensure proper transcription initiation of the sgRNA, for constructs with either a U6 or 7SL promoter, a G was inserted between the promoter and spacer sequence. For constructs with a U2, U5 or U3 promoter, an A was inserted between the promoter and spacer sequence. When the dead Cas9-TALE-AD and sgRNA complex binds the GUS reporter construct, the TALE activation domain functions as a transcription factor activating the minimal CaMV promoter, resulting in higher expression of the GUS transcript and ultimately higher levels of GUS protein expression.
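The rule for the initiating nucleotide described above can be expressed as a small lookup. This is a sketch only: the mapping of promoter class to G or A comes from the text, while the function name, the placement of a scaffold sequence after the spacer, and the placeholder strings in the usage note are illustrative assumptions.

```python
# Initiating nucleotide by snRNA promoter class, as stated in the text.
INITIATING_NUCLEOTIDE = {"U6": "G", "7SL": "G", "U2": "A", "U5": "A", "U3": "A"}


def assemble_sgrna_cassette(promoter_class, promoter_seq, spacer_seq, scaffold_seq=""):
    """Join promoter, initiating nucleotide, spacer and (optionally) an
    sgRNA scaffold. Appending the scaffold after the spacer is an
    assumption about cassette layout, not something stated in the text."""
    try:
        start = INITIATING_NUCLEOTIDE[promoter_class]
    except KeyError:
        raise ValueError(f"unknown promoter class: {promoter_class!r}") from None
    return promoter_seq + start + spacer_seq + scaffold_seq
```

As a usage illustration with placeholder strings rather than real sequences, `assemble_sgrna_cassette("U6", "ppp", "sss", "ccc")` places a G between the promoter and the spacer, while the same call with `"U3"` places an A there.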

TABLE 11 SEQ ID NO corresponding to sgRNA expression constructs

Promoter + sgRNA                 Promoter
SEQ ID NO:         Promoter      SEQ ID NO:
241                U6Chr08       200
242                7SLCR07       176
243                U2snRNA_I     184
244                U5snRNA_E     191
245                U3CR08b       181

For the assay, corn leaf protoplasts were transfected with 0.8 pmol of dead Cas9-TALE-AD expression cassette, 0.5 pmol of the GUS expression cassette, 1.6 pmol of one of the sgRNA expression cassettes, 650 ng of Luciferase expression cassette, and 300 ng of Renilla Luciferase (RLUC) expression cassette. The transfected protoplasts were harvested 18 hours later and GUS activity was measured using the 4-methylumbelliferyl-beta-D-glucuronide (MUG, Sigma, St. Louis, Mo.) fluorimetric assay, and luciferase and RLUC activity were measured and used to normalize relative to transfection controls. The activity of GUS is a readout of how often the dead Cas9-TALE-AD binds to the reporter plasmid. Each class of snRNA promoter driving sgRNA gave higher normalized GUS activity compared to the control (FIG. 19). The U3CR08b (U3_8B in FIG. 19) promoter resulted in the highest normalized GUS activity of about 10× over control. The two promoters 7SLCR07 and U6Chr08 both gave about the same normalized GUS activity of about 4× over control. The two promoters U2snRNA_I (Us_I in FIG. 19) and U5snRNA_E (U5_e in FIG. 19) were each at or slightly above 2× over control for normalized GUS activity. These results indicate that the 7SL, U6, U3, U2, and U5 snRNA promoters may be good to excellent candidates for use in sgRNA expression constructs for a CRISPR/Cas9 system useful in genome modification.

The differences in normalized GUS expression observed using the dead Cas9-TALE-AD assay do not mirror the normalized GUS mRNA levels shown in the corn leaf protoplast assay detailed in Example 17.

1. A recombinant DNA construct comprising a snRNA promoter selected from the group consisting of: a U6 promoter, a U3 promoter, a U2 promoter, a U5 promoter, and a 7SL promoter; operably linked to: (i) a sequence encoding a single-guide RNA (sgRNA), or (ii) a sequence specifying a non-coding RNA; and wherein the sequence of said snRNA promoter comprises SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ ID NOs:146-149, SEQ ID NOs:160-165, 167-201, or SEQ ID NOs:247-283; or a fragment thereof, wherein the fragment is at least 140 bp in length.

2.-8. (canceled)

9. The recombinant DNA construct of claim 1, further comprising a transcription termination sequence.

10. The recombinant DNA construct of claim 1, further comprising a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product.

11. The recombinant DNA construct of claim 10, wherein the Cas endonuclease gene product is further operably linked to a nuclear localization sequence (NLS).

12. The recombinant DNA construct of claim 10, wherein the sequence encoding said Cas endonuclease is selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, and SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

13. (canceled)

14. The recombinant DNA construct of claim 1, wherein the non-coding RNA is selected from the group consisting of: a microRNA (miRNA), a miRNA precursor, a small interfering RNA (siRNA), a small RNA (22-26 nt in length) and precursor encoding same, a heterochromatic siRNA (hc-siRNA), a Piwi-interacting RNA (piRNA), a hairpin double strand RNA (hairpin dsRNA), a trans-acting siRNA (ta-siRNA), and a naturally occurring antisense siRNA (nat-siRNA).

15.-19. (canceled)

20. A cell comprising the recombinant DNA construct of claim 1.

21. The cell of claim 20, wherein the cell is a plant cell.

22. A method of introducing a double-strand break in the genome of a cell, comprising introducing in said cell: a) at least one recombinant DNA construct of claim 1; and b) a second recombinant DNA construct comprising a sequence encoding a promoter operably linked to a sequence encoding a clustered, regularly interspaced, short palindromic repeats (CRISPR)-associated Cas endonuclease gene product operably linked to a nuclear localization sequence (NLS).

23.-24. (canceled)

25. The method of claim 22, wherein the sequence encoding said Cas endonuclease is selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, and SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

26. A method of introducing a double-strand break in the genome of a cell, comprising introducing to said cell at least one recombinant DNA construct of claim 10.

27.-28. (canceled)

29. The method of claim 26, wherein the sequence encoding the Cas endonuclease is selected from the group consisting of SEQ ID NO:27, SEQ ID NO:68, and SEQ ID NO:97, SEQ ID NO:119, and SEQ ID NO:136.

30.-59. (canceled)

60. A method of genome modification comprising: a) introducing a double-strand break in the genome of a plant cell by the method according to claim 22; and b) introducing into said plant cell a recombinant blunt-end double-strand DNA fragment, wherein said recombinant blunt-end double-strand DNA fragment is incorporated into said double strand break by endogenous DNA repair.

61. A method of genome modification comprising: a) introducing a double-strand break in the genome of a plant cell by the method according to claim 26; and b) introducing into said plant cell a recombinant blunt-end double-strand DNA fragment, wherein said recombinant blunt-end double-strand DNA fragment is incorporated into said double strand break by endogenous DNA repair.

62.-97. (canceled)